FUNCTIONAL DOMAINS IN FLAVOBACTERIUM OKEANOKOITES
(FOKI) RESTRICTION ENDONUCLEASE



BACKGROUND OF THE INVENTION 

1. Field of the Invention:

The present invention relates to the FokI
restriction endonuclease system. In particular, the
present invention relates to DNA segments encoding
the separate functional domains of this restriction
endonuclease system.

The present invention also relates to the
construction of two insertion mutants of FokI
endonuclease.

2. Background Information:

Type II endonucleases and modification
methylases are bacterial enzymes that recognize
specific sequences in duplex DNA. The endonuclease
cleaves the DNA while the methylase methylates
adenine or cytosine residues so as to protect the
host genome against cleavage [Type II restriction
and modification enzymes. In Nucleases (Eds. Modrich
and Roberts) Cold Spring Harbor Laboratory, New
York, pp. 109-154, 1982]. These
restriction-modification (R-M) systems function to
protect cells from infection by phage and plasmid
molecules that would otherwise destroy them.

As many as 2500 restriction enzymes with
over 200 specificities have been detected and
purified (Wilson and Murray, Annu. Rev. Genet.
25:585-627, 1991). The recognition sites of most of
these enzymes are 4-6 base pairs long. The small
size of the recognition sites is beneficial because
phage genomes are usually small and short
recognition sites occur more frequently in the
phage.
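
The frequency argument above can be illustrated with a rough estimate. The following is a minimal sketch that assumes a random sequence with equal base composition (an idealization not stated in the text), under which a site of n base pairs is expected about once every 4^n base pairs. The function name and the 48,502 bp phage lambda genome length are illustrative choices and are not part of the original disclosure.

```python
def expected_sites(genome_length_bp: int, site_length_bp: int) -> float:
    """Expected occurrences of one specific site of the given length.

    Assumes each position is independently A, C, G or T with equal
    probability, so a site of n bp occurs about once per 4**n bp.
    """
    return genome_length_bp / (4 ** site_length_bp)


if __name__ == "__main__":
    lambda_genome_bp = 48_502  # phage lambda, used only as an example
    for n in (4, 6, 8):
        # Shorter sites are expected many times in a small genome,
        # while an 8-bp site may not occur at all.
        print(n, round(expected_sites(lambda_genome_bp, n), 2))
```

Under this simplified model, a 4-6 base pair site is expected to occur repeatedly in a small phage genome, whereas a longer site is expected less than once.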

Eighty different R-M systems belonging to
the Type IIS class with over 35 specificities have
been identified. This class is unique in that the
cleavage site of the enzyme is separate from the
recognition sequence. Usually the distance between
the recognition site and the cleavage site is quite
precise (Szybalski et al., Gene, 100:13-26, 1991).
Among all these enzymes, the FokI restriction
endonuclease is the best characterized member
of the Type IIS class. The FokI endonuclease
(RFokI) recognizes asymmetric pentanucleotides in
double-stranded DNA, 5'-GGATG-3' (SEQ ID NO: 1) in
one strand and 3'-CCTAC-5' (SEQ ID NO: 2) in the
other, and introduces staggered cleavages at sites
away from the recognition site (Sugisaki et al.,
Gene 16:73-78, 1981). In contrast, the FokI
methylase (MFokI) modifies DNA, thereby rendering the
DNA resistant to digestion by FokI endonuclease.
The FokI restriction and modification genes have
been cloned and their nucleotide sequences deduced
(Kita et al., J. of Biol. Chem., 264:5751-5756,
1989). Nevertheless, the domain structure of the
FokI restriction endonuclease remains unknown,
although a three domain structure has been suggested
(Wilson and Murray, Annu. Rev. Genet. 25:585-627,
1991).

SUMMARY OF THE INVENTION 
Accordingly, it is an object of the
present invention to provide isolated domains of
Type IIS restriction endonuclease.

It is another object of the present
invention to provide hybrid restriction enzymes
which are useful for mapping and sequencing.

An additional object of the present
invention is to provide two insertion mutants of
FokI which have an increased distance of cleavage
from the recognition site as compared to the
wild-type enzyme. The polymerase chain reaction
(PCR) is utilized to construct the two mutants.

Various other objects and advantages of
the present invention will become obvious from the
drawings and the following description of the
invention.

In one embodiment, the present invention
relates to a DNA segment encoding the recognition
domain of a Type IIS endonuclease which contains the
sequence-specific recognition activity of the Type
IIS endonuclease or a DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of the Type IIS
endonuclease.

In another embodiment, the present
invention relates to an isolated protein consisting
essentially of the N-terminus or recognition domain
of the FokI restriction endonuclease which protein
has the sequence-specific recognition activity of
the endonuclease or an isolated protein consisting
essentially of the C-terminus or catalytic domain of
the FokI restriction endonuclease which protein has
the nuclease activity of the endonuclease.

In a further embodiment, the present
invention relates to a DNA construct comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of the Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of the Type
IIS endonuclease; and a vector. In the construct,
the first DNA segment and the second DNA segment are
operably linked to the vector to result in the
production of a hybrid restriction enzyme.

In another embodiment, the present
invention relates to a hybrid restriction enzyme
comprising the catalytic domain of a Type IIS
endonuclease which contains the cleavage activity of
the Type IIS endonuclease linked to a recognition
domain of an enzyme or a protein other than the Type
IIS endonuclease from which the cleavage domain is
obtained.

In a further embodiment, the present
invention relates to a DNA construct comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of the Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of the Type
IIS endonuclease; a third DNA segment comprising one
or more codons, wherein the third DNA segment is
inserted between the first DNA segment and the
second DNA segment; and a vector. Preferably, the
third segment contains four or seven codons.

In another embodiment, the present
invention relates to a procaryotic cell comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of the Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of the Type
IIS endonuclease; a third DNA segment comprising one
or more codons, wherein the third DNA segment is
inserted between the first DNA segment and the
second DNA segment; and a vector. The first DNA
segment and the second DNA segment are operably
linked to the vector so that a single protein is
produced.

BRIEF DESCRIPTION OF THE DRAWINGS
FIGURE 1 shows sequences of the 5' and 3'
primers used to introduce new translation signals
into fokIM and fokIR genes during PCR amplification
(SEQ ID NOs: 3-9). SD represents Shine-Dalgarno
consensus RBS for Escherichia coli (E. coli) and a
7-bp spacer separates the RBS from the ATG start
codon. The fokIM primers are flanked by NcoI
sites. The fokIR primers are flanked by BamHI
sites. Start and stop codons are shown in bold
letters. The 18-bp complement sequence is
complementary to the sequence immediately following
the stop codon of MFokI gene.

FIGURE 2 shows the structure of plasmids
pACYCfokIM, pRRSfokIR and pCBfokIR. The
PCR-modified fokIM gene was inserted at the NcoI
site of pACYC184 to form pACYCfokIM. The
PCR-generated fokIR gene was inserted at the BamHI
sites of pRRS and pCB to form pRRSfokIR and
pCBfokIR, respectively. pRRS possesses a lac UV5
promoter and pCB contains a strong tac promoter. In
addition, these vectors contain the positive
retroregulator sequence downstream of the inserted
fokIR gene.

FIGURE 3 shows SDS (0.1%) - polyacrylamide
(12%) gel electrophoretic profiles at each step in
the purification of FokI endonuclease. Lanes: 1,
protein standards; 2, crude extract from uninduced
cells; 3, crude extract from cells induced with 1 mM
IPTG; 4, phosphocellulose pool; 5, 50-70% (NH4)2SO4
fractionation pool; and 6, DEAE pool.

FIGURE 4 shows SDS (0.1%) - polyacrylamide
(12%) gel electrophoretic profiles of tryptic
fragments at various time points of trypsin
digestion of FokI endonuclease in presence of the
oligonucleotide DNA substrate,
d-5'-CCTCTGGATGCTCTC-3' (SEQ ID NO: 10):
5'-GAGAGCATCCAGAGG-3' (SEQ ID NO: 11). Lanes: 1,
protein standards; 2, FokI endonuclease; 3, 2.5 min;
4, 5 min; 5, 10 min; 6, 20 min; 7, 40 min; 8, 80
min; 9, 160 min of trypsin digestion respectively.
Lanes 10-13: HPLC purified tryptic fragments.
Lanes: 10, 41 kDa fragment; 11, 30 kDa fragment;
12, 11 kDa fragment; and 13, 25 kDa fragment.

FIGURE 5 shows the identification of DNA
binding tryptic fragments of FokI endonuclease using
an oligo dT-cellulose column. Lanes: 1, protein
standards; 2, FokI endonuclease; 3, 10 min trypsin
digestion mixture of FokI - oligo complex; 4,
tryptic fragments that bound to the oligo
dT-cellulose column; 5, 160 min trypsin digestion
mixture of FokI - oligo complex; 6, tryptic
fragments that bound to the oligo dT-cellulose
column.

FIGURE 6 shows an analysis of the cleavage
properties of the tryptic fragments of FokI
endonuclease.

(A) The cleavage properties of the
tryptic fragments were analyzed by agarose gel
electrophoresis. 1 µg of pTZ19R in 10 mM Tris.HCl
(pH 8), 50 mM NaCl, 1 mM DTT, and 10 mM MgCl2 was
digested with 2 µl of the solution containing the
fragments (tryptic digests, breakthrough and eluate
respectively) at 37°C for 1 hr in a reaction volume
of 10 µl. Lanes 4 to 6 correspond to trypsin
digestion of FokI - oligo complex in absence of
MgCl2. Lanes 7 to 9 correspond to trypsin digestion
of FokI - oligo complex in presence of 10 mM MgCl2.
Lanes: 1, 1 kb ladder; 2, pTZ19R; 3, pTZ19R
digested with FokI endonuclease; 4 and 7, reaction
mixture of the tryptic digests of FokI - oligo
complex; 5 and 8, 25 kDa C-terminal fragment in the
breakthrough volume; 6 and 9, tryptic fragments of
FokI that bound to the DEAE column. The intense
bands at bottom of the gel correspond to excess
oligonucleotides.

(B) SDS (0.1%) - polyacrylamide (12%) gel
electrophoretic profiles of fragments from the DEAE
column. Lanes 3 to 5 correspond to trypsin
digestion of FokI - oligo complex in absence of
MgCl2. Lanes 6 to 8 correspond to trypsin digestion
of FokI - oligo complex in presence of 10 mM MgCl2.
Lanes: 1, protein standards; 2, FokI endonuclease;
3 and 6, reaction mixture of the tryptic digests of
FokI - oligo complex; 4 and 7, 25 kDa C-terminal
fragment in the breakthrough volume; 5 and 8,
tryptic fragments of FokI that bound to the DEAE
column.

FIGURE 7 shows an analysis of
sequence-specific binding of DNA by 41 kDa
N-terminal fragment using gel mobility shift assays.
For the exchange reaction, the complex (10 µl) was
incubated with 1 µl of 32P-labeled specific (or
non-specific) oligonucleotide duplex in a volume of
20 µl containing 10 mM Tris.HCl, 50 mM NaCl and
10 mM MgCl2 at 37°C for various times. 1 µl of the
5'-32P-labeled specific probe
[d-5'-CCTCTGGATGCTCTC-3' (SEQ ID NO: 10):
5'-GAGAGCATCCAGAGG-3' (SEQ ID NO: 11)] contained 12
picomoles of the duplex and ~50 x 10^3 cpm. 1 µl of
the 5'-32P-labeled non-specific probe
[5'-TAATTGATTCTTAA-3' (SEQ ID NO: 12):
5'-ATTAAGAATCAATT-3' (SEQ ID NO: 13)] contained 12
picomoles of the duplex and ~25 x 10^3 cpm. (A)
Lanes: 1, specific oligonucleotide duplex; 2, 41
kDa N-terminal fragment-oligo complex; 3 and 4,
specific probe incubated with the complex for 30 and
120 min respectively. (B) Lanes: 1, non-specific
oligonucleotide duplex; 2, 41 kDa N-terminal
fragment-oligo complex; 3 and 4, non-specific probe
incubated with the complex for 30 and 120 min
respectively.

FIGURE 8 shows SDS (0.1%) - polyacrylamide
(12%) gel electrophoretic profiles of tryptic
fragments at various time points of trypsin
digestion of FokI endonuclease. The enzyme (200 µg)
in a final volume of 200 µl containing 10 mM
Tris.HCl, 50 mM NaCl and 10 mM MgCl2 was digested
with trypsin at RT. The trypsin to FokI ratio was
1:50 by weight. Aliquots (28 µl) from the reaction
mixture were removed at different time intervals and
quenched with excess antipain. Lanes: 1, protein
standards; 2, FokI endonuclease; 3, 2.5 min; 4, 5.0
min; 5, 10 min; 6, 20 min; 7, 40 min; 8, 80 min; and
9, 160 min of trypsin digestion respectively.

FIGURE 9 shows the tryptic map of FokI
endonuclease. (A) FokI endonuclease fragmentation
pattern in absence of the oligonucleotide substrate.
(B) FokI endonuclease fragmentation pattern in
presence of the oligonucleotide substrate.

FIGURE 10 shows the predicted secondary
structure of FokI based on its primary sequence
using the PREDICT program. (See SEQ ID NO: 31.) The
trypsin cleavage site of FokI in the presence of DNA
substrates is indicated by the arrow. The
KSELEEKKSEL segment is highlighted. The symbols are
as follows: h, helix; s, sheet; and #, random coil.

FIGURE 11 shows the sequences of the 5'
and 3' oligonucleotide primers used to construct the
insertion mutants of FokI (see SEQ ID NO: 32, SEQ ID
NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36,
SEQ ID NO: 37, SEQ ID NO: 38 and SEQ ID NO: 39,
respectively). The four and seven codon inserts are
shown in bold letters. The amino acid sequence is
indicated over the nucleotide sequence. The same 3'
primer was used in the PCR amplification of both
insertion mutants.

FIGURE 12 shows the SDS/PAGE profiles of
the mutant enzymes purified to homogeneity. Lanes:
1, protein standards; 2, FokI; 3, mutant FokI with
4-codon insertion; and 4, mutant FokI with 7-codon
insertion.

FIGURE 13 shows an analysis of the DNA
sequence specificity of the mutant enzymes. The DNA
substrates were digested in 10 mM Tris.HCl, pH
8.0/50 mM NaCl/1 mM DTT/10 mM MgCl2 at 37°C for 2
hrs.

(A) Cleavage pattern of pTZ19R DNA
substrate analyzed by 1% agarose gel
electrophoresis. 2 µg of pTZ19R DNA was used in each
reaction. Lanes: 1, 1-kilobase (kb) ladder; 2,
pTZ19R; 3, pTZ19R digested with FokI; 4, pTZ19R
digested with mutant FokI with 4-codon insertion;
and 5, pTZ19R digested with mutant FokI with 7-codon
insertion.

(B) Cleavage pattern of 256 bp DNA
substrate containing a single FokI site analyzed by
1.5% agarose gel electrophoresis. 1 µg of
radiolabeled substrates (32P-labeled on individual
strands) was digested as described above. The
agarose gel was stained with ethidium bromide and
visualized under UV light. Lanes 2 to 6 correspond
to the substrate in which the 5'-CATCC-3'
strand is 32P-labeled. Lanes 7 to 11 correspond to
the substrate in which the 5'-GGATG-3' strand is
32P-labeled. Lanes: 1, 1 kb ladder; 2 and 7,
32P-labeled 250 bp DNA substrates; 3 and 8,
32P-labeled substrates cleaved with FokI; 4 and 9,
wild-type FokI purified in the laboratory; 5 and 10,
mutant FokI with 4-codon insertion; 6 and 11, mutant
FokI with 7-codon insertion.

(C) Autoradiograph of the agarose gel
from above. Lanes: 2 to 11, same as in B.

FIGURE 14 shows an analysis of the
distance of cleavage from the recognition site by
FokI and the mutant enzymes. The unphosphorylated
oligonucleotides were used for dideoxy DNA
sequencing with pTZ19R as the template. The
sequencing products (G, A, T, C) were
electrophoresed on a 6% acrylamide gel containing 7M
urea, and the gel dried. The products were then
exposed to an x-ray film for 2 hrs. Cleavage
products from the 100 bp and the 256 bp DNA
substrates are shown in A and B, respectively. I
corresponds to substrates containing 32P-label on the
5'-GGATG-3' strand, and II corresponds to substrates
containing 32P-label on the 5'-CATCC-3' strand.
Lanes: 1, FokI; 2, FokI; 3, mutant FokI with
4-codon insertion; and 4, mutant FokI with 7-codon
insertion.

FIGURE 15 shows a map of the cleavage
site(s) of FokI and the mutant enzymes based on the
100 bp DNA substrate containing a single FokI site:
(A) wild-type FokI; (B) mutant FokI with 4-codon
insertion; and (C) mutant FokI with 7-codon
insertion (see SEQ ID NO: 40). The sites of cleavage
are indicated by the arrows. Major cleavage sites
are shown by larger arrows.

DETAILED DESCRIPTION OF THE INVENTION 
The present invention is based on the
identification and characterization of the
functional domains of the FokI restriction
endonuclease. In the experiments resulting in the
present invention, it was discovered that the FokI
restriction endonuclease is a two domain system, one
domain of which possesses the sequence-specific
recognition activity while the other domain contains
the nuclease cleavage activity.

The FokI restriction endonuclease
recognizes the non-palindromic pentanucleotide
5'-GGATG-3' (SEQ ID NO: 1):5'-CATCC-3' (SEQ ID NO: 2)
in duplex DNA and cleaves 9/13 nucleotides downstream
from the recognition site. Since 10 base pairs are
required for one turn of the DNA helix, the present
inventors hypothesized that the enzyme would
interact with one face of the DNA by binding at one
point and cleave at another point on the next turn
of the helix. This suggested the presence of two
separate protein domains, one for sequence-specific
recognition of DNA and one for endonuclease
activity. The hypothesized two domain structure was
shown to be the correct structure of the FokI
endonuclease system by studies that resulted in the
present invention.
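
The 9/13 cleavage geometry described above can be illustrated with a short sketch. It assumes that "9/13" means a cut 9 nucleotides 3' of the site on the 5'-GGATG-3' strand and 13 nucleotides from the site on the 5'-CATCC-3' strand; the function name, the coordinate convention and the test sequence are hypothetical and are not taken from the original disclosure.

```python
import re


def find_foki_cuts(seq, ggatg_offset=9, catcc_offset=13):
    """Locate FokI sites on the top strand and predict cut positions.

    A cut coordinate k means the break falls between 0-based
    top-strand positions k-1 and k. Each hit is a tuple:
    (orientation, site_start, cut_on_GGATG_strand, cut_on_CATCC_strand).
    Sites too close to an end for both cuts to fit are skipped.
    """
    seq = seq.upper()
    hits = []
    # Forward orientation: GGATG on the top strand, cuts lie to the right.
    for m in re.finditer(r"(?=GGATG)", seq):
        end = m.start() + 5
        cut_g = end + ggatg_offset
        cut_c = end + catcc_offset
        if cut_c <= len(seq):
            hits.append(("forward", m.start(), cut_g, cut_c))
    # Reverse orientation: CATCC on the top strand, cuts lie to the left.
    for m in re.finditer(r"(?=CATCC)", seq):
        start = m.start()
        cut_g = start - ggatg_offset
        cut_c = start - catcc_offset
        if cut_c >= 0:
            hits.append(("reverse", start, cut_g, cut_c))
    return hits


if __name__ == "__main__":
    demo = "A" * 10 + "GGATG" + "A" * 20  # hypothetical substrate
    for hit in find_foki_cuts(demo):
        print(hit)
```

Passing `ggatg_offset=10, catcc_offset=14` would model an enzyme that cuts one nucleotide further from the site; that choice of parameters is likewise only an illustration.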

Accordingly, in one embodiment, the
present invention relates to a DNA segment which
encodes the N-terminus of the FokI restriction
endonuclease (preferably, about the N-terminal
two-thirds of the protein). This DNA segment encodes
a protein which has the sequence-specific
recognition activity of the endonuclease, that is,
the encoded protein recognizes the non-palindromic
pentanucleotide d-5'-GGATG-3' (SEQ ID NO: 1):
5'-CATCC-3' (SEQ ID NO: 2) in duplex DNA.
Preferably, the DNA segment of the present invention
encodes amino acids 1-382 of the FokI endonuclease.

In a further embodiment, the present
invention relates to a DNA segment which encodes the
C-terminus of the FokI restriction endonuclease.
The protein encoded by this DNA segment of the
present invention has the nuclease cleavage activity
of the FokI restriction endonuclease. Preferably,
the DNA segment of the present invention encodes
amino acids 383-578 of the FokI endonuclease. DNA
segments of the present invention can be readily
isolated from a biological sample using methods
known in the art, for example, gel electrophoresis,
affinity chromatography, polymerase chain reaction
(PCR) or a combination thereof. Further, the DNA
segments of the present invention can be chemically
synthesized using standard methods in the art.

The present invention also relates to the
proteins encoded by the DNA segments of the present
invention. Thus, in another embodiment, the present
invention relates to a protein consisting
essentially of the N-terminus of the FokI
endonuclease which retains the sequence-specific
recognition activity of the enzyme. This protein of
the present invention has a molecular weight of
about 41 kilodaltons as determined by SDS
polyacrylamide gel electrophoresis in the presence
of 2-mercaptoethanol.

In a further embodiment, the present
invention relates to a protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease (preferably, the C-terminal
1/3 of the protein). The molecular weight of this
protein is about 25 kilodaltons as determined by SDS
polyacrylamide gel electrophoresis in the presence
of 2-mercaptoethanol.

The proteins of the present invention can
be isolated or purified from a biological sample
using methods known in the art. For example, the
proteins can be obtained by isolating and cleaving
the FokI restriction endonuclease. Alternatively,
the proteins of the present invention can be
chemically synthesized or produced using recombinant
DNA technology and purified.

The DNA segments of the present invention
can be used to generate "hybrid" restriction enzymes
by linking other DNA binding protein domains with
the nuclease domain of FokI. This can be achieved
chemically as well as by recombinant DNA technology.
Such chimeric enzymes are useful for physical
mapping and sequencing of genomes of various
species, such as humans, mice and plants. For
example, such enzymes would be suitable for use in
mapping the human genome.

Such chimeric enzymes are also valuable
research tools in recombinant DNA technology and
molecular biology. Currently only 4-6 base pair
cutters and a few 8 base pair cutters are available
commercially. (There are about 10 endonucleases
which cut >6 base pairs that are available
commercially.) By linking other DNA binding
proteins to the nuclease domain of FokI, new enzymes
can be generated that recognize more than 6 base
pairs in DNA.

Accordingly, in a further embodiment, the
present invention relates to a DNA construct and
the hybrid restriction enzyme encoded therein. The
DNA construct of the present invention comprises a
first DNA segment encoding the nuclease domain of
the FokI restriction endonuclease, a second DNA
segment encoding a sequence-specific recognition
domain and a vector. The first DNA segment and the
second DNA segment are operably linked to the vector
so that expression of the segments can be effected,
thereby yielding a chimeric restriction enzyme. The
construct can comprise regulatory elements such as
promoters (for example, T7, tac, trp and lac UV5
promoters), transcriptional terminators or
retroregulators (for example, stem loops). Host
cells (procaryotes such as E. coli) can be
transformed with the DNA constructs of the present
invention and used for the production of chimeric
restriction enzymes.

The hybrid enzymes of the present
invention comprise the nuclease domain of FokI
linked to a recognition domain of another enzyme or
DNA binding protein (such as naturally occurring
DNA binding proteins that recognize 6 base pairs).
Suitable recognition domains include, but are not
limited to, the recognition domains of zinc finger
motifs; homeo domain motifs; other DNA binding
protein domains of lambda repressor, lac repressor,
cro, gal4; DNA binding protein domains of oncogenes
such as myc, jun; and other naturally occurring
sequence-specific DNA binding proteins that
recognize >6 base pairs.

The hybrid restriction enzymes of the
present invention can be produced by those skilled
in the art using known methodology. For example,
the enzymes can be chemically synthesized or
produced using recombinant DNA technology well known
in the art. The hybrid enzymes of the present
invention can be produced by culturing host cells
(such as HB101, RR1, RB791 and MM294) containing
the DNA construct of the present invention and
isolating the protein. Further, the hybrid enzymes
can be chemically synthesized, for example, by
linking the nuclease domain of FokI to the
recognition domain using common linkage methods
known in the art, for example, using protein
cross-linking agents such as EDC/NHS, DSP, etc.

While the FokI restriction endonuclease
was the enzyme studied in the following experiments,
it is expected that other Type IIS endonucleases
(such as those listed in Table 2) will function
using a similar two domain structure which one
skilled in the art could readily determine based on
the present invention.

Recently, StsI, a heteroschizomer of FokI,
has been isolated from Streptococcus sanguis (Kita
et al., Nucleic Acids Research 20(3):618, 1992).
StsI recognizes the same nonpalindromic
pentadeoxyribonucleotide 5'-GGATG-3':5'-CATCC-3' as
FokI but cleaves 10/14 nucleotides downstream of the
recognition site. The StsI RM system has been
cloned and sequenced (Kita et al., Nucleic Acids
Research 20(16):4167-72, 1992). Considerable amino
acid sequence homology (~30%) has been detected
between the endonucleases, FokI and StsI.

Another embodiment of the invention
relates to the construction of two insertion mutants
of FokI endonuclease using the polymerase chain
reaction (PCR). In particular, this embodiment
includes a DNA construct comprising a first DNA
segment encoding the catalytic domain of a Type IIS
endonuclease which contains the cleavage activity of
the Type IIS endonuclease, a second DNA segment
encoding a sequence-specific recognition domain
other than the recognition domain of the Type IIS
endonuclease, and a third DNA segment comprising one
or more codons. The third DNA segment is inserted
between the first DNA segment and the second DNA
segment. The construct also includes a vector.
The Type IIS endonuclease is FokI restriction
endonuclease.

Suitable recognition domains include, but
are not limited to, zinc finger motifs, homeo domain
motifs, DNA binding domains of repressors, DNA
binding domains of oncogenes and naturally occurring
sequence-specific DNA binding proteins that
recognize >6 base pairs.

As noted above, the recognition domain of
FokI restriction endonuclease is at the amino
terminus of FokI endonuclease, whereas the cleavage
domain is probably at the carboxyl terminal third of
the molecule. It is likely that the domains are
connected by a linker region, which defines the
spacing between the recognition and the cleavage
sites of the DNA substrate. This linker region of
FokI is susceptible to cleavage by trypsin in the
presence of a DNA substrate, yielding a 41-kDa
amino-terminal fragment (the DNA binding domain) and
a 25-kDa carboxyl-terminal fragment (the cleavage
domain). Secondary structure prediction of FokI
endonuclease based on its primary amino acid
sequence supports this hypothesis (see Figure 10).
The predicted structure reveals a long stretch of
alpha helix region at the junction of the
recognition and cleavage domains. This helix
probably constitutes the linker which connects the
two domains of the enzyme. Thus, it was thought
that the cleavage distance of FokI from the
recognition site could be altered by changing the
length of this spacer (the alpha helix). Since 3.6
amino acids are required to form one turn of the
alpha helix, insertion of either four codons or
seven codons in this region would extend the
pre-existing helix in the native enzyme by one or
two turns, respectively. Close examination of the
amino acid sequence of this helix region revealed the
presence of two KSEL repeats separated by amino
acids EEK (Figure 10) (see SEQ ID NO: 21). The
segments KSEL (4 codons) (see SEQ ID NO: 22) and
KSELEEK (7 codons) (see SEQ ID NO: 23) appeared to be
good choices for insertion within this helix in
order to extend it by one and two turns,
respectively. (See Examples X and XI.) Thus,
genetic engineering was utilized in order to create
mutant enzymes.
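
The helix arithmetic behind the choice of inserts can be checked with a minimal sketch. The 3.6 residues per turn figure comes from the text above; the rise of 1.5 angstroms per residue is the commonly cited textbook value for an ideal alpha helix and is an assumption added here, as are the function and constant names.

```python
RESIDUES_PER_TURN = 3.6          # stated in the text
RISE_PER_RESIDUE_ANGSTROM = 1.5  # assumed textbook value, not from the text


def helix_turns(n_residues: int) -> float:
    """Number of alpha-helical turns spanned by n_residues."""
    return n_residues / RESIDUES_PER_TURN


def helix_length_angstrom(n_residues: int) -> float:
    """Approximate axial length added by n_residues of ideal helix."""
    return n_residues * RISE_PER_RESIDUE_ANGSTROM


if __name__ == "__main__":
    for insert in ("KSEL", "KSELEEK"):
        n = len(insert)
        print(insert, n, round(helix_turns(n), 2), helix_length_angstrom(n))
```

On these assumptions the four-residue insert corresponds to roughly one helical turn and the seven-residue insert to roughly two, which matches the rationale given for selecting KSEL and KSELEEK.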

In particular, the mutants are obtained by
inserting one or more, and preferably four or seven,
codons between the recognition and cleavage domains
of FokI. More specifically, the four or seven
codons are inserted at nucleotide 1152 of the gene
encoding the endonuclease. The mutants have the
same DNA sequence specificity as the wild-type
enzyme. However, they cleave one nucleotide further
away from the recognition site on both strands of
the DNA substrates as compared to the wild-type
enzyme.

Analysis of the cut sites of FokI and the
mutants, based on the cleavage of the 100 bp
fragment, is summarized in Figure 15. Insertion of
four (or seven) codons between the recognition and
cleavage domains of FokI is accompanied by an
increase in the distance of cleavage from the
recognition site. This information further supports
the presence of two separate protein domains within
the FokI endonuclease: one for the sequence-specific
recognition and the other for the
endonuclease activity. The two domains are
connected by a linker region which defines the
spacing between the recognition and the cleavage
sites of the DNA substrate. The modular structure
of the enzyme suggests it may be feasible to
construct chimeric endonucleases of different
sequence specificity by linking other DNA-binding
proteins to the cleavage domain of the FokI
endonuclease.

In view of the above information, another
embodiment of the invention includes a procaryotic
cell comprising a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of the Type IIS
endonuclease, a second DNA segment encoding a
sequence-specific recognition domain other than the
recognition domain of the Type IIS endonuclease, and
a third DNA segment comprising one or more codons.
The third DNA segment is inserted between the first
DNA segment and the second DNA segment. The cell
also includes a vector. Additionally, it should be
noted that the first DNA segment and the second DNA
segment are operably linked to the vector so that a
single protein is produced. The third segment may
consist essentially of four or seven codons.

The present invention also includes the
protein produced by the procaryotic cell referred to
directly above. In particular, the isolated protein
consists essentially of the recognition domain of
the FokI restriction endonuclease, the catalytic
domain of the FokI restriction endonuclease, and
amino acids encoded by the codons present in the
third DNA segment.

The following non-limiting Examples are
provided to describe the present invention in
greater detail.

The following materials and methods were
utilized in the isolation and characterization of
the FokI restriction endonuclease functional domains
as exemplified hereinbelow.

Bacterial strains and plasmids 
Recombinant plasmids were transformed into
E. coli RB791 iq cells, which carry the lac iq allele
on the chromosome (Brent and Ptashne, PNAS USA,
78:4204-4208, 1981), or E. coli RR1 cells. Plasmid
pACYCfokIM is a derivative of pACYC184 carrying the
PCR-generated fokIM gene inserted into the NcoI site.
The plasmid expresses the FokI methylase
constitutively and was present in RB791 cells (or
RR1 cells) whenever the fokIR gene was introduced on
a separate compatible plasmid. The FokI methylase
modifies FokI sites and provides protection against
chromosomal cleavage. The construction of vectors
pRRS and pCB is described elsewhere (Skoglund et
al., Gene, 88:1-5, 1990).






Enzymes, biochemicals and oligos
Oligo primers for PCR were synthesized
with an Applied Biosystems DNA synthesizer using
cyanoethyl phosphoramidite chemistry and purified by
reversed-phase HPLC. Restriction enzymes were
purchased from New England Biolabs. The DNA ligase
and IPTG were from Boehringer-Mannheim. PCR reagents
were purchased as a GeneAmp kit from Perkin-Elmer.
The plasmid purification kit was from QIAGEN.

Restriction enzyme assays 
Cells from a 5-ml sample of culture medium
were harvested by centrifugation, resuspended in 0.5
ml sonication buffer [50 mM Tris.HCl (pH 8), 14 mM 2-
mercaptoethanol], and disrupted by sonication (3 x 5
seconds each) on ice. The cellular debris was
removed by centrifugation and the crude extract used
in the enzyme assay. Reaction mixtures (10 µl)
contained 10 mM Tris.HCl (pH 8), 10 mM MgCl2, 7 mM
2-mercaptoethanol, 50 µg of BSA, 1 µg of plasmid
pTZ19R (U.S. Biochemicals) and 1 µl of crude enzyme.
Incubation was at 37°C for 15 min. tRNA (10 µg) was
added to the reaction mixtures when necessary to
inhibit non-specific nucleases. After digestion,
1 µl of dye solution (100 mM EDTA, 0.1% bromophenol
blue, 0.1% xylene cyanol, 50% glycerol) was added,
and the samples were electrophoresed on a 1% agarose
gel. Bands were stained with 0.5 µg ethidium
bromide/ml and visualized with 310-nm ultraviolet
light.

SDS/PAGE
Proteins were prepared in sample buffer
and electrophoresed in SDS (0.1%)-polyacrylamide
(12%) gels as described by Laemmli (Laemmli, Nature,
227:680-685, 1970). Proteins were stained with
Coomassie blue.

Example I 
Cloning of FokI RM system
The FokI system was cloned by selecting
for the modification phenotype. Flavobacterium
okeanokoites strain DNA was isolated by the method
described by Caserta et al. (Caserta et al., J.
Biol. Chem., 262:4770-4777, 1987). Several
Flavobacterium okeanokoites genome libraries were
constructed in plasmids pBR322 and pUC13 using the
cloning enzymes PstI, BamHI and BglII. Plasmid
library DNA (10 µg) was digested with 100 units of
FokI endonuclease to select for plasmids expressing
the fokIM+ phenotype.

Surviving plasmids were transformed into 
RR1 cells and transformants were selected on plates
containing the appropriate antibiotic. After two
rounds of biochemical enrichment, several plasmids
expressing the fokIM+ phenotype from these libraries
were identified. Plasmids from these clones were
totally resistant to digestion by FokI.

Among eight transformants that were
analyzed from the F. okeanokoites pBR322 PstI
library, two appeared to carry the fokIM gene and
plasmids from these contained a 5.5 kb PstI
fragment. Among eight transformants that were
picked from the F. okeanokoites pBR322 BamHI library,
two appeared to carry the fokIM gene and their
plasmids contained a ~18 kb BamHI fragment. Among
eight transformants that were analyzed from the F.
okeanokoites genome BglII library in pUC13, six
appeared to carry the fokIM gene. Three of these
clones had an 8 kb BglII insert while the rest
contained a 16 kb BglII fragment.

Plating efficiency of phage λ on these
clones suggested that they also carried the fokIR
gene. The clones with the 8-kb BglII insert
appeared to be most resistant to phage infection.
Furthermore, the FokI endonuclease activity was
detected in the crude extract of this clone after
partial purification on a phosphocellulose column.
The plasmid pUCfokIRM from this clone was chosen
for further characterization.

The 5.5 kb PstI fragment was transferred
to M13 phages and the nucleotide sequences of parts
of this insert determined using Sanger's sequencing
method (Sanger et al., PNAS USA, 74:5463-5467,
1977). The complete nucleotide sequence of the FokI
RM system has been published by other laboratories
(Looney et al., Gene, 80:193-208, 1989; Kita et al.,
Nucleic Acid Res., 17:8741-8753, 1989; Kita et al.,
J. Biol. Chem., 264:5751-5756, 1989).

Example II

Construction of an efficient overproducer clone of
FokI endonuclease using polymerase chain reaction.
The PCR technique was used to alter
transcriptional and translational signals
surrounding the fokIR gene so as to achieve
overexpression in E. coli (Skoglund et al., Gene,
88:1-5, 1990). The ribosome-binding sites preceding
the fokIR and fokIM genes were altered to match the
consensus E. coli signal.

In the PCR reaction, plasmid pUCfokIRM DNA
linearized with BamHI was used as the template. PCR
reactions (100 µl) contained 0.25 nmol of each
primer, 50 µM of each dNTP, 10 mM Tris.HCl (pH 8.3
at 25°C), 50 mM KCl, 1.5 mM MgCl2, 0.01% (w/v)
gelatin, 1 ng of template DNA, and 5 units of Taq DNA
polymerase. The oligo primers used for the
amplification of the fokIR and fokIM genes are shown
in Figure 1. Reaction mixtures (run in
quadruplicate) were overlaid with mineral oil and
reactions were carried out using a Perkin-Elmer-Cetus
Thermal Cycler.

Initial template denaturation was
programmed for 2 min. Thereafter, the cycle profile
was programmed as follows: 2 min at 37°C
(annealing), 5 min at 72°C (extension), and 1 min at
94°C (denaturation). This profile was repeated for
25 cycles and the final 72°C extension was increased
to 10 min. The aqueous layers of the reaction
mixtures were pooled and extracted once with 1:1
phenol/chloroform and twice with chloroform. The
DNA was ethanol-precipitated and resuspended in 20
µl TE buffer [10 mM Tris.HCl (pH 7.5), 1 mM EDTA].
The DNA was then cleaved with appropriate
restriction enzymes to generate cohesive ends and
gel-purified.
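
As a rough check on the thermal program described above, the following sketch adds up the programmed hold times. It assumes (this is not stated in the text) that the 10-min final extension replaces the 5-min extension of the last cycle, and it ignores ramp times between temperatures. The constant and function names are illustrative only.

```python
# Programmed hold times in minutes, taken from the cycle profile in the text.
INITIAL_DENATURATION_MIN = 2
CYCLE_STEPS_MIN = {
    "annealing_37C": 2,
    "extension_72C": 5,
    "denaturation_94C": 1,
}
CYCLES = 25
FINAL_EXTENSION_MIN = 10  # assumed to replace the 5-min extension of the last cycle


def total_hold_time_min():
    """Sum of programmed hold times, ignoring temperature ramps."""
    per_cycle = sum(CYCLE_STEPS_MIN.values())
    extra_final = FINAL_EXTENSION_MIN - CYCLE_STEPS_MIN["extension_72C"]
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + extra_final


if __name__ == "__main__":
    print(total_hold_time_min(), "min of programmed hold time")
```

Under these assumptions the program holds for 207 min in total, that is, about three and a half hours before ramp times are counted.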

The construction of an over-producer clone
was done in two steps. First, the PCR-generated DNA
containing the fokIM gene was digested with NcoI and
gel purified. It was then ligated into NcoI-cleaved
and dephosphorylated pACYC184 and the recombinant
DNA transfected into E. coli RB791 or RR1 cells
made competent as described by Maniatis et al.
(Maniatis et al., Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY, 1982). After Tc selection, several
clones were picked and plasmid DNA was examined by
restriction analysis for the presence of the fokIM
gene fragment in correct orientation to the
chloramphenicol promoter of the vector (see Figure
2). This plasmid expresses FokI methylase
constitutively and thus protects the host from
chromosomal cleavage when the fokIR gene is
introduced into this host on a compatible plasmid.
The plasmid DNA from these clones is therefore
resistant to FokI digestion.

Second, the PCR-generated fokIR fragment
was ligated into BamHI-cleaved and dephosphorylated
high expression vectors pRRS or pCB. pRRS possesses
a lac UV5 promoter and pCB contains the strong tac
promoter. In addition, these vectors contain the
positive retroregulator stem-loop sequence derived
from the crystal protein-encoding gene of Bacillus
thuringiensis downstream of the inserted fokIR gene.
The recombinant DNA was transfected into competent
E. coli RB791 [pACYCfokIM] or
RR1 [pACYCfokIM] cells. After Tc and Ap antibiotic
selection, several clones were picked and plasmid
DNA was examined by restriction analysis for the
fokIR gene fragment in correct orientation for
expression from the vector promoters. These
constructs were then examined for enzyme production.

To produce the enzyme, plasmid-containing
RB791 or RR1 cells were grown at 37°C with
shaking in 2x concentrated TY medium [1.6% tryptone,
1% yeast extract, 0.5% NaCl (pH 7.2)] supplemented
with 20 µg Tc/ml (except for the pUCfokIRM plasmid)
and 50 µg Ap/ml. IPTG was added to a concentration
of 1 mM when the cell density reached O.D.600 = 0.8.
The cells were incubated overnight (12 hr) with
shaking. As is shown in Figure 2, both constructs
yield FokI to a level of 5-8% of the total cellular
protein.

Example III
Purification of FokI endonuclease
A simple three-step purification procedure
was used to obtain electrophoretically homogeneous
FokI endonuclease. RR1 [pACYCfokIM, pRRSfokIR] cells
were grown in 6 L of 2x TY containing 20 µg Tc/ml and
50 µg Ap/ml at 37°C to A600 = 0.8 and then induced
overnight with 1 mM IPTG. The cells were harvested
by centrifugation and then resuspended in 250 ml of
buffer A [10 mM Tris.phosphate (pH 8.0), 7 mM 2-
mercaptoethanol, 1 mM EDTA, 10% glycerol] containing
50 mM NaCl.

The cells were disrupted at maximum
intensity on a Branson Sonicator for 1 hr at 4°C.
The sonicated cells were centrifuged at 12,000 g for
2 hr at 4°C. The supernatant was then diluted to 1 L
with buffer A containing 50 mM NaCl. The
supernatant was loaded onto a 10 ml phosphocellulose
(Whatman) column pre-equilibrated with buffer A
containing 50 mM NaCl. The column was washed with
50 ml of loading buffer and the protein was eluted
with an 80-ml total gradient of 0.05 M to 0.5 M NaCl
in buffer A. The fractions were monitored by A280
absorption and analyzed by electrophoresis on SDS
(0.1%)-polyacrylamide (12%) gels (Laemmli, Nature,
227:680-685, 1970). Proteins were stained with
Coomassie blue.
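
To relate a fraction's elution volume to its salt concentration, the gradient above can be sketched as follows. The text gives only the end points (0.05 M and 0.5 M NaCl) and the total volume (80 ml); the linear shape of the gradient is an assumption made here for illustration, and the function name is hypothetical.

```python
def nacl_molar_at(volume_ml, start_m=0.05, end_m=0.5, total_ml=80.0):
    """NaCl concentration (M) after `volume_ml` of an assumed linear gradient."""
    if not 0 <= volume_ml <= total_ml:
        raise ValueError("volume must lie within the gradient")
    return start_m + (end_m - start_m) * volume_ml / total_ml


if __name__ == "__main__":
    # Concentration at the start, midpoint and end of the 80-ml gradient.
    for v in (0, 40, 80):
        print(f"{v:>3} ml -> {nacl_molar_at(v):.3f} M NaCl")
```

If the gradient used was not linear, the mapping from fraction number to NaCl concentration would differ from this sketch.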

Restriction endonuclease activity of the
fractions was assayed using pTZ19R as substrate.
The fractions containing FokI were pooled and
fractionated with ammonium sulfate. The 50-70%
ammonium sulfate fraction contained the FokI
endonuclease. The precipitate was resuspended in 50
ml of buffer A containing 25 mM NaCl and loaded onto
a DEAE column. FokI does not bind to DEAE while
many contaminating proteins do. The flow-through
was concentrated on a phosphocellulose column.
Further purification was achieved using a gel
filtration (AcA 44) column. The FokI was purified
to electrophoretic homogeneity using this procedure.

SDS (0.1%)-polyacrylamide (12%) gel
electrophoresis profiles of protein species present
at each stage of purification are shown in Figure 3.
The sequence of the first ten amino acids of the
purified enzyme was determined by protein
sequencing. The determined sequence was the same as
that predicted from the nucleotide sequence.
Crystals of this purified enzyme have also been
grown using PEG 4000 as the precipitant. FokI
endonuclease was purified further using an AcA 44 gel
filtration column.

Example IV 

Analysis of FokIR endonuclease by trypsin cleavage
in the presence of DNA substrate.

Trypsin is a serine protease and it
cleaves at the C-terminal side of lysine and
arginine residues. This is a very useful enzyme to
study the domain structure of proteins and enzymes.
Trypsin digestion of FokI in the presence of its
substrate, d-5'-CCTCTGGATGCTCTC-3' (SEQ ID NO:10):
5'-GAGAGCATCCAGAGG-3' (SEQ ID NO:11) was carried
out with an oligonucleotide duplex to FokI molar
the oligonucleotide duplex in a volume of 180 µl
containing 10 mM Tris.HCl, 50 mM NaCl, 10% glycerol
and 10 mM MgCl2 at RT for 1 hr. Trypsin (20 µl, 0.2
mg/ml) was added to the mixture. Aliquots (28 µl)
from the reaction mixture were removed at different
time intervals and quenched with excess trypsin
inhibitor, antipain. The tryptic fragments were
purified by reversed-phase HPLC and their N-terminal
sequences determined using an automatic protein
sequenator from Applied Biosystems.
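
The cleavage rule stated above (cutting on the C-terminal side of lysine and arginine) can be sketched as a small function. This is a simplified illustration: it cuts at every K and R, whereas the limited digestions described here cleave only at exposed sites, and it ignores refinements such as reduced cleavage before proline. The example peptide is made up and is not a FokI sequence.

```python
def tryptic_peptides(sequence):
    """Split a one-letter protein sequence after every K or R (complete digest)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            # Trypsin cuts on the C-terminal side, so the K/R stays
            # at the end of the peptide it terminates.
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    # Hypothetical peptide used only to show the rule.
    print(tryptic_peptides("MKAARGG"))
```

Because every fragment other than the N-terminal one begins immediately after a K or R, the N-terminal sequence of a purified fragment locates its cut site within the predicted protein sequence, which is how the fragments are mapped in this Example.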

The time course of trypsin digestion of
FokI endonuclease in the presence of 2.5 molar
excess of oligonucleotide substrate and 10 mM MgCl2
is shown in Figure 4. At the 2.5 min time point
only two major fragments other than the intact FokI
were present, a 41 kDa fragment and a 25 kDa
fragment. Upon further trypsin digestion, the 41
kDa fragment degraded into a 30 kDa fragment and an
11 kDa fragment. The 25 kDa fragment appeared to be
resistant to any further trypsin digestion. This
fragment appeared to be less stable if the trypsin
digestion of the FokI-oligo complex was carried out
in the absence of MgCl2.

Only three major fragments (30 kDa, 25 kDa
and 11 kDa) were present at the 160 min time point.
Each of these fragments (41 kDa, 30 kDa, 25 kDa and
11 kDa) was purified by reversed-phase HPLC and
their N-terminal amino acid sequences were determined
(Table I). By comparing these N-terminal sequences
to the predicted sequence of FokI, the 41 kDa and 25
kDa fragments were identified as N-terminal and C-
terminal fragments, respectively. In addition, the
30 kDa fragment was N-terminal.

Example V 

Isolation of DNA binding tryptic fragments of FokI
endonuclease using oligo dT-cellulose affinity
column.

The DNA binding properties of the tryptic
fragments were analyzed using an oligo dT-cellulose
column. FokI (160 µg) was incubated with a 2.5
molar excess of the oligonucleotide duplex [d-5'-
CCTCTGGATGCTCTC(A)15-3' (SEQ ID NO:14):
5'-GAGAGCATCCAGAGG(A)15-3' (SEQ ID NO:15)] in a
volume of 90 µl containing 10 mM Tris.HCl (pH 8), 50
mM NaCl, 10% glycerol and 10 mM MgCl2 at RT for 1 hr.
Trypsin (10 µl, 0.2 mg/ml) was added to the solution
to initiate digestion. The ratio of trypsin to FokI
(by weight) was 1:80. Digestion was carried out
for 10 min to obtain predominantly the 41 kDa
N-terminal fragment and the 25 kDa C-terminal
fragment in the reaction mixture. The reaction was
quenched with a large excess of antipain (10 µg) and
diluted in loading buffer [10 mM Tris.HCl (pH 8.0),
1 mM EDTA and 100 mM MgCl2] to a final volume of
400 µl.
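
The stated 1:80 trypsin-to-FokI weight ratio follows directly from the quantities given in this paragraph, as the short calculation below illustrates. The function names are illustrative only.

```python
def mass_ug(volume_ul, conc_mg_per_ml):
    """Mass in micrograms; 1 mg/ml is the same as 1 ug/ul."""
    return volume_ul * conc_mg_per_ml


def substrate_per_protease(substrate_ug, protease_volume_ul, protease_mg_per_ml):
    """Weight of substrate protein per unit weight of protease."""
    return substrate_ug / mass_ug(protease_volume_ul, protease_mg_per_ml)


if __name__ == "__main__":
    # 160 ug FokI digested with 10 ul of trypsin at 0.2 mg/ml (2 ug).
    ratio = substrate_per_protease(160, 10, 0.2)
    print(f"trypsin:FokI = 1:{ratio:.0f} (w/w)")
```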

The solution was loaded onto an oligo dT-
cellulose column (0.5 ml, Sigma, catalog #0-7751)
pre-equilibrated with the loading buffer. The
breakthrough was passed over the oligo dT-cellulose
column six times. The column was washed with 5 ml
of loading buffer and then eluted twice with 0.4 ml
of 10 mM Tris.HCl (pH 8.0), 1 mM EDTA. These
fractions contained the tryptic fragments that were
bound to the oligonucleotide DNA substrate. The
tryptic fragment bound to the oligo dT-cellulose
column was analyzed by SDS-polyacrylamide gel
electrophoresis.

In a separate reaction, the trypsin
digestion was carried out for 160 min to obtain
predominantly the 30 kDa, 25 kDa and 11 kDa
fragments in the reaction mixture.

Trypsin digestion of FokI endonuclease for
10 min yielded the 41 kDa N-terminal fragment and
the 25 kDa C-terminal fragment as the predominant
species in the reaction mixture (Figure 5, Lane 3).
When this mixture was passed over the oligo
dT-cellulose column, only the 41 kDa N-terminal
fragment was retained by the column, suggesting that
the DNA binding property of FokI endonuclease is in
the N-terminal 2/3's of the enzyme. The 25 kDa
fragment was not retained by the oligo dT-cellulose
column.

Trypsin digestion of the FokI-oligo complex
for 160 min yielded predominantly the 30 kDa, 25 kDa
and 11 kDa fragments (Figure 5, Lane 5). When this
reaction mixture was passed over the oligo
dT-cellulose column, only the 30 kDa and 11 kDa
fragments were retained. It appears these species
together bind DNA and they arise from further
degradation of the 41 kDa N-terminal fragment. The
25 kDa fragment was not retained by the oligo
dT-cellulose column. It also did not bind to DEAE
and thus could be purified by passage through a DEAE
column and recovery in the breakthrough volume.

FokI (390 µg) was incubated with a 2.5 molar
excess of oligonucleotide duplex [d-5'-
CCTCTGGATGCTCTC-3' (SEQ ID NO:10): 5'-
GAGAGCATCCAGAGG-3' (SEQ ID NO:11)] in a total volume
of 170 µl containing 10 mM Tris.HCl (pH 8), 50 mM
NaCl and 10% glycerol at RT for 1 hr. Digestion
with trypsin (30 µl; 0.2 mg/ml) in the absence of
MgCl2 was for 10 min at RT to maximize the yield of
the 41 kDa N-terminal fragment. The reaction was
quenched with excess antipain (200 µl). The tryptic
digest was passed through a DEAE column. The 25 kDa
C-terminal fragment was recovered in the
breakthrough volume. All the other tryptic
fragments (41 kDa, 30 kDa and 11 kDa) were retained
by the column and were eluted with 0.5 M NaCl buffer
(3 x 200 µl). In a separate experiment, the trypsin
digestion of the FokI-oligo complex was done in the
presence of 10 mM MgCl2 at RT for 60 min to maximize
the yield of the 30 kDa and 11 kDa fragments. The
purified 25 kDa fragment cleaved non-specifically
both unmethylated DNA substrate (pTZ19R; Figure 6)
and methylated DNA substrate (pACYCfokIM) in the
presence of MgCl2. The products are small,
indicating that the fragment is relatively
non-specific in cleavage. The products were
dephosphorylated using calf intestinal phosphatase
and rephosphorylated using polynucleotide kinase and
[γ-32P]ATP. The 32P-labeled products were digested
to mononucleotides using DNase I and snake venom
phosphodiesterase. Analysis of the mononucleotides
by PEI-cellulose chromatography indicates that the
25 kDa fragment cleaved preferentially
phosphodiester bonds 5' to G>A>>T~C. The 25 kDa C-
terminal fragment thus constitutes the cleavage
domain of FokI endonuclease.

The 41 kDa N-terminal fragment-oligo
complex was purified by agarose gel electrophoresis.
FokI endonuclease (200 µg) was incubated with a 2.5
molar excess of oligonucleotide duplex [d-5'-
CCTCTGGATGCTCTC-3' (SEQ ID NO:10): 5'-
GAGAGCATCCAGAGG-3' (SEQ ID NO:11)] in a volume of 180
µl containing 10 mM Tris.HCl (pH 8.0), 50 mM NaCl
and 10% glycerol at RT for 1 hr. Tracer amounts of
32P-labeled oligonucleotide duplex were incorporated
into the complex to monitor it during gel
electrophoresis. Digestion with trypsin (20 µl; 0.2
mg/ml) was for 12 min at RT to maximize the yield of
the 41 kDa N-terminal fragment. The reaction was
quenched with excess antipain. The 41 kDa N-
terminal fragment-oligo complex was purified by
agarose gel electrophoresis. The band corresponding
to the complex was excised and recovered by
electroelution in a dialysis bag
(~600 µl). Analysis of the complex by SDS-PAGE
revealed the 41 kDa N-terminal fragment to be the
major component. The 30 kDa N-terminal fragment and
the 11 kDa C-terminal fragment were present as minor
components. These together appeared to bind DNA and
co-migrate with the 41 kDa N-terminal fragment-oligo
complex.

The binding specificity of the 41 kDa N-
terminal fragment was determined using gel mobility
shift assays.

Example VI 
Gel mobility shift assays
The specific oligos (d-5'-CCTCTGGATGCTCTC-
3' (SEQ ID NO:10) and d-5'-GAGAGCATCCAGAGG-3' (SEQ
ID NO:11)) were 5'-32P-labeled in a reaction
mixture of 25 µl containing 40 mM Tris.HCl (pH 7.5),
20 mM MgCl2, 50 mM NaCl, 10 mM DTT, 10 units of T4
polynucleotide kinase (from New England Biolabs) and
20 µCi [γ-32P]ATP (3000 Ci/mmol). The mixture was
incubated at 37°C for 30 min. The kinase was
inactivated by heating the reaction mixture to 70°C
for 15 min. After addition of 200 µl of water, the
solution was passed through a Sephadex G-25
(Superfine) column (Pharmacia) to remove the
unreacted [γ-32P]ATP. The final concentration of
labeled single-strand oligos was 27 µM.

The single strands were then annealed to
form the duplex in 10 mM Tris.HCl (pH 8.0), 50 mM
NaCl to a concentration of 12 µM. 1 µl of the
solution contained ~12 picomoles of oligo duplex
and ~50 x 10^3 cpm. The non-specific oligos (d-5'-
TAATTGATTCTTAA-3' (SEQ ID NO:12) and d-5'-
ATTAAGAATCAATT-3' (SEQ ID NO:13)) were labeled with
[γ-32P]ATP and polynucleotide kinase as described
herein. The single-stranded oligos were annealed to
yield the duplex at a concentration of 12 µM. 1 µl
of the solution contained ~12 picomoles of oligo
duplex and ~25 x 10^3 cpm.
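
The amounts quoted above follow from the identity 1 µM = 1 pmol/µl. The sketch below reproduces the ~12 pmol figure and derives an approximate specific activity from the stated counts; the function names are illustrative, and the cpm values are the approximate ones given in the text.

```python
def picomoles(conc_um, volume_ul):
    """Amount in pmol; 1 uM equals 1 pmol per ul."""
    return conc_um * volume_ul


def cpm_per_pmol(total_cpm, conc_um, volume_ul):
    """Approximate specific activity of a labeled duplex."""
    return total_cpm / picomoles(conc_um, volume_ul)


if __name__ == "__main__":
    print(picomoles(12, 1), "pmol of duplex in 1 ul at 12 uM")
    print(round(cpm_per_pmol(50e3, 12, 1)), "cpm/pmol, specific duplex")
    print(round(cpm_per_pmol(25e3, 12, 1)), "cpm/pmol, non-specific duplex")
```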

10 µl of 41 kDa N-terminal fragment-oligo
complex (~2 pmoles) in 10 mM Tris.HCl, 50 mM NaCl
and 10 mM MgCl2 was incubated with 1 µl of 32P-
labeled specific oligonucleotide duplex (or 32P-
labeled non-specific oligonucleotide duplex) at 37°C
for 30 min and 120 min, respectively. 5 µl of 75%
glycerol was added to each sample, which was then
loaded on an 8% nondenaturing polyacrylamide gel.
Electrophoresis was at 300 volts in TBE buffer until
bromophenol blue moved ~6 cm from the top of the
gel. The gel was dried and autoradiographed.

The complex readily exchanged 32P-labeled
specific oligonucleotide duplex that contained the
FokI recognition site as seen from the gel mobility
shift assays (Figure 7). It did not, however,
exchange the 32P-labeled non-specific
oligonucleotide duplex that did not contain the FokI
recognition site. These results indicate that all
the information necessary for sequence-specific
recognition of DNA is encoded within the 41 kDa N-
terminal fragment of FokI.
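
The distinction between the specific and non-specific duplexes can be checked directly on the oligonucleotide sequences given in this Example. The sketch below looks for the FokI recognition sequence 5'-GGATG-3' (or its complement 5'-CATCC-3') in each strand and confirms that SEQ ID NO:10 and NO:11 are reverse complements. The helper names are illustrative only.

```python
FOKI_SITE = "GGATG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_foki_site(strand):
    """True if the strand carries GGATG or its complement CATCC."""
    return FOKI_SITE in strand or reverse_complement(FOKI_SITE) in strand


if __name__ == "__main__":
    oligos = {
        "SEQ ID NO:10": "CCTCTGGATGCTCTC",
        "SEQ ID NO:11": "GAGAGCATCCAGAGG",
        "SEQ ID NO:12": "TAATTGATTCTTAA",
        "SEQ ID NO:13": "ATTAAGAATCAATT",
    }
    for name, seq in oligos.items():
        print(name, "contains FokI site:", has_foki_site(seq))
```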

Example VII

Analysis of FokI by trypsin cleavage in the absence
of DNA substrate.

A time course of trypsin digestion of FokI
endonuclease in the absence of the DNA substrate is
shown in Figure 8. Initially, FokI was cleaved into a
58 kDa fragment and an 8 kDa fragment. The 58 kDa
fragment did not bind DNA substrates and was not
retained by the oligo dT-cellulose column. On
further digestion, the 58 kDa fragment degraded into
several intermediate tryptic fragments. However,
the complete trypsin digestion yielded only 25 kDa
fragments (appearing as two overlapping bands).

Each of these species (58 kDa, 25 kDa and
8 kDa) was purified by reversed-phase HPLC and
their amino-terminal amino acid sequences determined
(Table I). Comparison of the N-terminal sequences
to the predicted FokI sequence revealed the 8
kDa fragment to be N-terminal and the 58 kDa
fragment to be C-terminal. This further supports
the conclusion that the N-terminus of FokI is
responsible for the recognition domain. Sequencing
the N-terminus of the 25 kDa fragments revealed the
presence of two different components. A time course
of trypsin digestion of FokI endonuclease in the
presence of a non-specific DNA substrate yielded a
profile similar to the one obtained when trypsin
digestion of FokI is carried out in the absence of
any DNA substrate.

Example VIII 
Cleavage specificity of the 25 kDa C-terminal
tryptic fragment of FokI

The 25 kDa C-terminal tryptic fragment of
FokI cleaved pTZ19R to small products indicating
non-specific cleavage. The degradation products
were dephosphorylated by calf intestinal phosphatase
and 32P-labeled with polynucleotide kinase and
[γ-32P]ATP. The excess label was removed using a
Sephadex G-25 (Superfine) column. The labeled
products were then digested with 1 unit of
pancreatic DNase I (Boehringer-Mannheim) in buffer
containing 50 mM Tris.HCl (pH 7.6), 10 mM MgCl2 at
37°C for 1 hr. Then, 0.02 units of snake venom
phosphodiesterase was added to the reaction mixture
and digested at 37°C for 1 hr.

Example IX 

Functional domains in FokI restriction endonuclease.
Analysis of functional domains of FokI (in
the presence and absence of substrates) using
trypsin is summarized in Figure 9. Binding of DNA
substrate by FokI was accompanied by alteration in
the structure of the enzyme. This study supports
the presence of two separate protein domains within
this enzyme: one for sequence-specific recognition
and the other for endonuclease activity. The
results indicate that the recognition domain is at
the N-terminus of the FokI endonuclease, while the
cleavage domain is probably in the C-terminal third
of the molecule.
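
The fragment sizes reported in Examples IV and VII are mutually consistent, which can be verified with simple bookkeeping. The sketch below uses only the apparent masses quoted in the text; the intact mass is derived from the fragment sums rather than taken from a stated value, and the variable names are illustrative.

```python
# Apparent masses in kDa, as reported for the tryptic digests.
WITH_DNA_FIRST_CUT = (41, 25)      # N-terminal, C-terminal
WITH_DNA_SECOND_CUT = (30, 11)     # products of the 41 kDa fragment
WITHOUT_DNA_FIRST_CUT = (8, 58)    # N-terminal, C-terminal


def implied_parent_mass(fragments):
    """Mass of the parent species implied by a pair of fragments."""
    return sum(fragments)


if __name__ == "__main__":
    intact = implied_parent_mass(WITH_DNA_FIRST_CUT)
    print("intact enzyme implied by 41 + 25:", intact, "kDa")
    print("intact enzyme implied by 8 + 58:",
          implied_parent_mass(WITHOUT_DNA_FIRST_CUT), "kDa")
    print("41 kDa fragment implied by 30 + 11:",
          implied_parent_mass(WITH_DNA_SECOND_CUT), "kDa")
    print(f"C-terminal 25 kDa share: {25 / intact:.2f}")
```

Both initial cleavage patterns imply the same intact mass, and the 25 kDa fragment accounts for a little over one third of it, in line with the statements that DNA binding resides in the N-terminal two thirds and cleavage in the C-terminal third.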

Examples Relating to Construction 

of Insertion Mutants (X-XIV)

The complete nucleotide sequence of the
FokI RM system has been published by various
laboratories (Looney et al., Gene 80:193-208, 1989
& Kita et al., J. Biol. Chem. 264:5751-56, 1989).
Experimental protocols for PCR are described, for
example, in Skoglund et al., Gene 88:1-5, 1990 and
in Bassing et al., Gene 113:83-88, 1992. The
procedures for cell growth and purification of the
mutant enzymes are similar to the ones used for the
wild-type FokI (Li et al., Proc. Nat'l. Acad. Sci.
USA 89:4275-79, 1992). Additional steps which
include Sephadex G-75 gel filtration and Heparin-
Sepharose CL-6B column chromatography were necessary
to purify the mutant enzymes to homogeneity.

25 Example X 

Mutagenesis of SpeI Site at Nucleotide 162 within the
fokIR Gene

The two step PCR technique used to
mutagenize one of the SpeI sites within the fokIR
gene is described in Landt et al., Gene 96:125-28,
1990. The three synthetic primers for this protocol
include: 1) the mutagenic primer
(5'-TCATAATAGCAACTAATTCTTTTTGGATCTT-3') (see SEQ ID
NO:24) containing one base mismatch within the SpeI
site; 2) the other primers, each of which is flanked
by restriction sites ClaI
(5'-CCATCGATATAGCCTTTTTTATT-3') (see SEQ ID NO:25)
and XbaI (5'-GCTCTAGAGGATCCGGAGGT-3') (see SEQ ID
NO:26), respectively. An intermediate fragment was
amplified using the XbaI primer and the mutagenic
primer during the first step. The ClaI primer was
then added to the intermediate for the second step
PCR. The final 0.3 kb PCR product was digested with
XbaI/ClaI to generate cohesive ends and gel-
purified. The expression vector (pRRSfokIR) was
cleaved with XbaI/ClaI. The large 4.2 kb fragment
was then gel-purified and ligated to the PCR
fragment. The recombinant DNA was transfected into
competent E. coli RR1[pACYCfokIM] cells. After
tetracycline and ampicillin antibiotic selection,
several clones were picked and their plasmid DNA
was examined by restriction analysis. The SpeI site
mutation was confirmed by sequencing the plasmid DNA
using Sanger's sequencing method (Sanger et al.,
Proc. Natl. Acad. Sci. USA 74:5463-67, 1977).

Example XI 

Construction of four (or seven) codon Insertion
Mutants

The PCR-generated DNA containing a four
(or seven) codon insertion was digested with
SpeI/XmaI and gel-purified. The plasmid pRRSfokIR
from Example X was cleaved with SpeI/XmaI, and the
large 3.9 kb fragment was gel-purified and ligated
to the PCR product. The recombinant DNA was
transfected into competent RR1[pACYCfokIM] cells,
and the desired clones identified as described in
Example X. The plasmids from these clones were
isolated and sequenced to confirm the presence of
the four (or seven) codon insertion within the fokIR
gene.

In particular, the construction of the
mutants was performed as follows: (1) There are
two SpeI sites at nucleotides 162 and 1152,
respectively, within the fokIR gene sequence. The
site at 1152 is located near the trypsin cleavage
site of FokI that separates the recognition and
cleavage domains. In order to insert the four (or
seven) codons around this region, the other SpeI
site at 162 was mutagenized using a two step PCR
technique (Landt et al., Gene 96:125-28, 1990).
Introduction of this SpeI site mutation in the fokIR
gene does not affect the expression levels of the
overproducer clones. (2) The insertion of four (or
seven) codons was achieved using the PCR technique.
The mutagenic primers used in the PCR amplification
are shown in Figure 11. Each primer has a 21 bp
sequence complementary to the fokIR gene. The 5'
ends of these primers are flanked by SpeI sites. The
codons for KSEL and KSELEEK repeats are incorporated
between the SpeI site and the 21 bp complement.
Degenerate codons were used in these repeats to
circumvent potential problems during PCR
amplification. The other primer is complementary to
the 3' end of the fokIR gene and is flanked by an
XmaI site. The PCR-generated 0.6 kb fragments
containing the four (or seven) codon inserts were
digested with SpeI/XmaI and gel-purified. These
fragments were substituted into the high expression
vector pRRSfokIR to generate the mutants. Several
clones of each mutant were identified and their DNA
sequence confirmed by Sanger's dideoxy chain
termination method (Sanger et al., Proc. Natl. Acad.
Sci. USA 74:5463-67, 1977).

Upon induction with 1 mM isopropyl β-D-
thiogalactoside (IPTG), the expression of mutant
enzymes in these clones became most prominent at 3
hrs as determined by SDS/PAGE. This was further
supported by the assays for the enzyme activity.
The levels of expression of the mutant enzymes in
these clones were much lower compared to the wild-
type FokI. IPTG induction for longer times resulted
in lower enzyme levels, indicating that the mutant
enzymes were actively degraded within these clones.
This suggests that the insertion of four (or seven)
codons between the recognition and cleavage domains
of FokI destabilizes the protein conformation, making
the enzymes more susceptible to degradation within
the cells. SDS/PAGE profiles of the mutant enzymes
are shown in Figure 12.

Example XII

Preparation of DNA Substrates with a single FokI
Site

Two substrates, each containing a single
FokI recognition site, were prepared by PCR using
pTZ19R as the template. Oligonucleotide primers,
5'-CGCAGTGTTATCACTCAT-3' and 5'-CTTGGTTGAGTACTCACC-
3' (see SEQ ID NO:27 and SEQ ID NO:28, respectively),
were used to synthesize the 100 bp fragment.
Primers, 5'-ACCGAGCTCGAATTCACT-3' and 5'-
GATTTCGGCCTATTGGTT-3' (see SEQ ID NO:29 and SEQ ID
NO:30, respectively), were used to prepare the 256
bp fragment. Individual strands within these
substrates were radiolabeled by using the
corresponding 32P-labeled phosphorylated primers
during PCR. The products were purified from low-
melting agarose gel, ethanol precipitated and
resuspended in TE buffer.

fixampl? XIII 

Analysis of the Sequence Specificity of the Mutant 

The agarose gel electrophoretic profiles of
the cleavage products of pTZ19R DNA by FokI and the
mutants are shown in Figure 13A. They are very
similar, suggesting that insertion of four (or seven)
codons in the linker region between the recognition
and cleavage domains does not alter the DNA sequence
specificity of the enzyme. This was further
confirmed by using 32P-labeled DNA substrates (100 bp
and 256 bp), each containing a single FokI site.
Substrates containing individual strands labeled
with 32P were prepared as described in Example XII.
FokI cleaves the 256 bp substrate into two fragments,
180 bp and 72 bp, respectively (Figure 13B). The
length of the fragments was calculated from the
32P-labeled 5' end of each strand. The autoradiograph
of the agarose gel is shown in Figure 13C. Depending
on which strand carries the 32P-label in the
substrate, either the 72 bp fragment or the 180 bp
fragment appears as a band in the autoradiograph.
The mutant enzymes reveal identical agarose gel
profiles and autoradiographs. Therefore, insertion
of four (or seven) codons between the recognition
and cleavage domains does not alter the DNA
recognition mechanism of FokI endonuclease.
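
The fragment sizes quoted above can be checked with simple arithmetic. The following Python sketch is illustrative only. It assumes that the two strands are cut four nucleotides apart, which is the stagger of a 9/13 cutter such as wild-type FokI; the function name and parameters are introduced here for the example. The text does not say which strand yields which fragment, so both cases are shown.

```python
def labeled_fragments(substrate_len, top_cut, stagger=4):
    """
    Lengths of the two 5'-end-labeled fragments of a linear duplex.

    substrate_len: duplex length in base pairs.
    top_cut: nucleotides between the 5' end of the top strand and its nick.
    stagger: how much further along the duplex the bottom strand is nicked
             (assumed to be 4, as for a 9/13 cutter).
    Returns (top-strand fragment, bottom-strand fragment), each measured
    from that strand's own 5' end.
    """
    bottom_cut = top_cut + stagger           # duplex coordinate of the bottom nick
    bottom_fragment = substrate_len - bottom_cut
    return top_cut, bottom_fragment


print(labeled_fragments(256, 180))  # prints: (180, 72)
print(labeled_fragments(256, 72))   # prints: (72, 180)
```

Under this assumption the two labeled fragments sum to 252 nucleotides rather than 256, the difference being the four-nucleotide stagger between the nicks.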
Example XIV

Analysis of the Cleavage Distances from the
Recognition Site by the Mutant Enzymes

To determine the distance of cleavage by
the mutant enzymes, their cleavage products of the
32P-labeled substrates were analyzed by PAGE (Figure
14). The digests were analyzed alongside the
sequencing reactions of pTZ19R performed with the
same primers used in PCR to synthesize these
substrates. The cleavage patterns of the 100 bp
fragment by FokI and the mutants are shown in Figure
14A. The cut sites are shifted away from the
recognition site on both strands of the substrates
in the case of the mutants, as compared to the
wild-type enzyme. The small observable shifts
between the sequencing gel and the cleavage products
are due to the unphosphorylated primers that were
used in the sequencing reactions.

On the 5'-GGATG-3' strand, both mutants
cut the DNA 10 nucleotides away from the site, while
on the 5'-CATCC-3' strand they cut 14 nucleotides
away from the recognition site. These appear to be
the major cut sites for both the mutants. A small
amount of cleavage similar to that of the wild-type
enzyme is also observed.
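
The shift in cut sites can be made concrete with a short Python sketch. It is illustrative only: the toy sequence is hypothetical, the function name is introduced here, and the 9/13 offsets used for the wild-type enzyme are the values generally reported for FokI rather than figures taken from this Example. Only sites written as GGATG on the given strand are scanned.

```python
SITE = "GGATG"


def cut_positions(seq, top_offset, bottom_offset):
    """
    For every GGATG in `seq` (top strand, 5'->3'), return a tuple
    (top_nick, bottom_nick). Each value is a duplex coordinate: the number
    of top-strand nucleotides lying 5' of the nick. Offsets are counted
    from the last nucleotide of the recognition site.
    """
    cuts = []
    start = seq.find(SITE)
    while start != -1:
        site_end = start + len(SITE)
        cuts.append((site_end + top_offset, site_end + bottom_offset))
        start = seq.find(SITE, start + 1)
    return cuts


toy = "ACGT" + SITE + "ACGTACGTACGTACGTACGT"  # hypothetical sequence

wild_type = cut_positions(toy, 9, 13)   # offsets assumed for wild-type FokI
mutant = cut_positions(toy, 10, 14)     # major cut sites reported for the mutants

print(wild_type)  # prints: [(18, 22)]
print(mutant)     # prints: [(19, 23)]
```

In this sketch both nicks move one nucleotide further from the site in the mutants, while the four-nucleotide stagger between the strands is unchanged.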

The cleavage pattern of the 256 bp
fragment is shown in Figure 14B. The pattern of
cleavage is similar to that of the 100 bp fragment.
Some cleavage is seen 15 nucleotides away from the
recognition site on the 5'-CATCC-3' strand in the
case of the mutants. The multiple cut sites for the
mutant enzymes could be attributed to the presence
of different conformations in these proteins, or
to the increased flexibility of the spacer
region between the two domains. Depending on the
DNA substrate, some variation in the intensity of
cleavage at these sites was observed. This may be
due to the nucleotide sequence around these cut
sites. Naturally occurring Type IIS enzymes with
multiple cut sites have been reported (Szybalski et
al., Gene 100:13-26, 1991).



TABLE 1

Amino-terminal sequences of FokI
fragments from trypsin digestion

Fragment   Amino-terminal sequence          DNA substrate   SEQ ID NO

8 kDa      VSKIRTFG*VQNPGKFENLKRVVQVFDRS    -               16
58 kDa     SEAPCDAIIQ                                       17
25 kDa     QLVKSELEEK                       +               18
41 kDa     VSKIRTFGWV                                       19
30 kDa     VSKIRTFGWV                                       19
11 kDa     FTRVPKRVY                                        20

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Chandrasegaran, Srinivasan



(ii) TITLE OF INVENTION: Functional Domains in
FokI Restriction Endonuclease

(iii) NUMBER OF SEQUENCES: 40 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Cushman, Darby & Cushman 

(B) STREET: 1100 New York Ave., N.W. 

(C) CITY: Washington 

(D) STATE: D.C. 

(E) COUNTRY: USA 

(F) ZIP: 20005-3918 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version
#1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Kokulis, Paul N.

(B) REGISTRATION NUMBER: 16,773

(C) REFERENCE/DOCKET NUMBER: 
PNK/4130/122364/CLB 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-861-3503 

(B) TELEFAX: 202-822-0944 

(C) TELEX: 6714627 CUSH

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGATG

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
CCTAC

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 18..35

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CCATGGAGGT TTAAAAT ATG AGA TTT ATT GGC AGC
                   Met Arg Phe Ile Gly Ser
                   1               5



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Arg Phe Ile Gly Ser
1               5

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATACCATGGG AATTAAATGA CACAGCATCA

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 22..42

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AGGATCCGG AGGTTTAAAA T ATG GTT TCT AAA ATA AGA ACT 42
                       Met Val Ser Lys Ile Arg Thr
                       1               5



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Val Ser Lys Ile Arg Thr
1               5

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TAGGATCCTC ATTAAAAGTT TATCTCGCCG TTATT 35

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Asn Asn Gly Glu Ile Asn Phe
1               5



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CCTCTGGATG CTCTC 15 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GAGAGCATCC AGAGG 15

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TAATTGATTC TTAA 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
ATTAAGAATC AATT 
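
As an illustrative consistency check on the listing, the relationship between the paired oligonucleotides of SEQ ID NO:10 through SEQ ID NO:13 can be examined with a reverse-complement helper. The Python sketch below is not part of the filed listing; the helper name is an assumption introduced here, and the sequences are copied from the entries above with the spacing removed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


SEQ10 = "CCTCTGGATGCTCTC"
SEQ11 = "GAGAGCATCCAGAGG"
SEQ12 = "TAATTGATTCTTAA"
SEQ13 = "ATTAAGAATCAATT"

# SEQ ID NO:10 and NO:11 are exact reverse complements (a blunt duplex
# carrying the GGATG/CATCC recognition site).
print(reverse_complement(SEQ10) == SEQ11)  # prints: True

# SEQ ID NO:12 and NO:13 match only when shifted by one nucleotide,
# that is, 13 paired positions with a single unpaired base at each end.
print(reverse_complement(SEQ12))                      # prints: TTAAGAATCAATTA
print(reverse_complement(SEQ12)[:13] == SEQ13[1:])    # prints: True
```

Whether the one-nucleotide offset in the second pair reflects the intended duplex design or a transcription artifact cannot be determined from the listing alone.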



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CCTCTGGATG CTCTCAAAAA AAAAAAAAAA

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GAGAGCATCC AGAGGAAAAA AAAAAAAAAA 30 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Val Ser Lys Ile Arg Thr Phe Gly Xaa Val Gln Asn Pro Gly
1               5                   10

Lys Phe Glu Asn Leu Lys Arg Val Val Gln Val Phe Asp Arg
15                  20                  25

Ser

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Ser Glu Ala Pro Cys Asp Ala Ile Ile Gln
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Gln Leu Val Lys Ser Glu Leu Glu Glu Lys
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Val Ser Lys Ile Arg Thr Phe Gly Trp Val
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Phe Thr Arg Val Pro Lys Arg Val Tyr
1               5



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Glu Glu Lys 
1 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Lys Ser Glu Leu 
1 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Lys Ser Glu Leu Glu Glu Lys
1               5

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
TAGCAACTAA TTCTTTTTGG ATCTT 25

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CCATCGATAT AGCCTTTTTT ATT 23

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GCTCTAGAGG ATCCGGAGGT

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
CGCAGTGTTA TCACTCAT 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
CTTGGTTGAG TACTCACC 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
ACCGAGCTCG AATTCACT 18

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
GATTTCGGCC TATTGGTT 18 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Met Val Ser Lys Ile Arg Thr Phe Gly Trp Val Gln Asn Pro
1               5                   10

Gly Lys Phe Glu Asn Leu Lys Arg Val Val Gln Val Phe Asp
15                  20                  25

Arg Asn Ser Lys Val His Asn Glu Val Lys Asn Ile Lys Ile
    30                  35                  40

Pro Thr Leu Val Lys Glu Ser Lys Ile Gln Lys Glu Leu Val
        45                  50                  55

Ala Ile Met Asn Gln His Asp Leu Ile Tyr Thr Tyr Lys Glu
            60                  65                  70

Leu Val Gly Thr Gly Thr Ser Ile Arg Ser Glu Ala Pro Cys
                75                  80

Asp Ala Ile Ile Gln Ala Thr Ile Ala Asp Gln Gly Asn Lys
85                  90                  95

Lys Gly Tyr Ile Asp Asn Trp Ser Ser Asp Gly Phe Leu Arg
    100                 105                 110



SUBSTITUTE SHEET (RULE 26) 



WO 94/18313 



PCT/US94/01201 



63 



Trp Ala His Ala Leu Gly Phe Ile Glu Tyr Ile Asn Lys Ser
        115                 120                 125

Asp Ser Phe Val Ile Thr Asp Val Gly Leu Ala Tyr Ser Lys
            130                 135                 140

Ser Ala Asp Gly Ser Ala Ile Glu Lys Glu Ile Leu Ile Glu
                145                 150

Ala Ile Ser Ser Tyr Pro Pro Ala Ile Arg Ile Leu Thr Leu
155                 160                 165

Leu Glu Asp Gly Gln His Leu Thr Lys Phe Asp Leu Gly Lys
    170                 175                 180

Asn Leu Gly Phe Ser Gly Glu Ser Gly Phe Thr Ser Leu Pro
        185                 190                 195

Glu Gly Ile Leu Leu Asp Thr Leu Ala Asn Ala Met Pro Lys
            200                 205                 210

Asp Lys Gly Glu Ile Arg Asn Asn Trp Glu Gly Ser Ser Asp
                215                 220

Lys Tyr Ala Arg Met Ile Gly Gly Trp Leu Asp Lys Leu Gly
225                 230                 235

Leu Val Lys Gln Gly Lys Lys Glu Phe Ile Ile Pro Thr Leu
    240                 245                 250

Gly Lys Pro Asp Asn Lys Glu Phe Ile Ser His Ala Phe Lys
        255                 260                 265

Ile Thr Gly Glu Gly Leu Lys Val Leu Arg Arg Ala Lys Gly
            270                 275                 280

Ser Thr Lys Phe Thr Arg Val Pro Lys Arg Val Tyr Trp Glu
                285                 290

Met Leu Ala Thr Asn Leu Thr Asp Lys Glu Tyr Val Arg Thr
295                 300                 305

Arg Arg Ala Leu Ile Leu Glu Ile Leu Ile Lys Ala Gly Ser
    310                 315                 320

Leu Lys Ile Glu Gln Ile Gln Asp Asn Leu Lys Lys Leu Gly
        325                 330                 335

Phe Asp Glu Val Ile Glu Thr Ile Glu Asn Asp Ile Lys Gly
            340                 345                 350

Leu Ile Asn Thr Gly Ile Phe Ile Glu Ile Lys Gly Arg Phe
                355                 360

Tyr Gln Leu Lys Asp His Ile Leu Gln Phe Val Ile Pro Asn
365                 370                 375

Arg Gly Val Thr Lys Gln Leu Val Lys Ser Glu Leu Glu Glu
    380                 385                 390

Lys Lys Ser Glu Leu Arg His Lys Leu Lys Tyr Val Pro His
        395                 400                 405

Glu Tyr Ile Glu Leu Ile Glu Ile Ala Arg Asn Ser Thr Gln
            410                 415                 420

Asp Arg Ile Leu Glu Met Lys Val Met Glu Phe Phe Met Lys
                425                 430

Val Tyr Gly Tyr Arg Gly Lys His Leu Gly Gly Ser Arg Lys
435                 440                 445

Pro Asp Gly Ala Ile Tyr Thr Val Gly Ser Pro Ile Asp Tyr
    450                 455                 460

Gly Val Ile Val Asp Thr Lys Ala Tyr Ser Gly Gly Tyr Asn
        465                 470                 475

Leu Pro Ile Gly Gln Ala Asp Glu Met Gln Arg Tyr Val Glu
            480                 485                 490

Glu Asn Gln Thr Arg Asn Lys His Ile Asn Pro Asn Glu Trp
                495                 500

Trp Lys Val Tyr Pro Ser Ser Val Thr Glu Phe Lys Phe Leu
505                 510                 515

Phe Val Ser Gly His Phe Lys Gly Asn Tyr Lys Ala Gln Leu
    520                 525                 530

Thr Arg Leu Asn His Ile Thr Asn Cys Asn Gly Ala Val Leu
        535                 540                 545

Ser Val Glu Glu Leu Leu Ile Gly Gly Glu Met Ile Lys Ala
            550                 555                 560

Gly Thr Leu Thr Leu Glu Glu Val Arg Arg Lys Phe Asn Asn
                565                 570

Gly Glu Ile Asn Phe
575



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Lys Gln Leu Val Lys Ser Glu Leu Glu Glu Lys
1 5 10 
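
For orientation, the peptide of SEQ ID NO:32 and the amino-terminal sequences of Table 1 can be located within SEQ ID NO:31 by a plain substring search. The Python sketch below is illustrative only and is not part of the filed listing. The one-letter string is a transcription of SEQ ID NO:31 made for this example (14 residues per row, as in the listing), and the dictionary labels and function name are assumptions introduced here. Positions are 1-based and count the initiator Met as residue 1.

```python
# SEQ ID NO:31 in one-letter code, 14 residues per row as in the listing.
ROWS = [
    "MVSKIRTFGWVQNP", "GKFENLKRVVQVFD", "RNSKVHNEVKNIKI", "PTLVKESKIQKELV",
    "AIMNQHDLIYTYKE", "LVGTGTSIRSEAPC", "DAIIQATIADQGNK", "KGYIDNWSSDGFLR",
    "WAHALGFIEYINKS", "DSFVITDVGLAYSK", "SADGSAIEKEILIE", "AISSYPPAIRILTL",
    "LEDGQHLTKFDLGK", "NLGFSGESGFTSLP", "EGILLDTLANAMPK", "DKGEIRNNWEGSSD",
    "KYARMIGGWLDKLG", "LVKQGKKEFIIPTL", "GKPDNKEFISHAFK", "ITGEGLKVLRRAKG",
    "STKFTRVPKRVYWE", "MLATNLTDKEYVRT", "RRALILEILIKAGS", "LKIEQIQDNLKKLG",
    "FDEVIETIENDIKG", "LINTGIFIEIKGRF", "YQLKDHILQFVIPN", "RGVTKQLVKSELEE",
    "KKSELRHKLKYVPH", "EYIELIEIARNSTQ", "DRILEMKVMEFFMK", "VYGYRGKHLGGSRK",
    "PDGAIYTVGSPIDY", "GVIVDTKAYSGGYN", "LPIGQADEMQRYVE", "ENQTRNKHINPNEW",
    "WKVYPSSVTEFKFL", "FVSGHFKGNYKAQL", "TRLNHITNCNGAVL", "SVEELLIGGEMIKA",
    "GTLTLEEVRRKFNN", "GEINF",
]
FOKI = "".join(ROWS)

# Peptides from Table 1 and from SEQ ID NO:32 (labels are for this sketch only).
PEPTIDES = {
    "58 kDa (SEQ ID NO:17)": "SEAPCDAIIQ",
    "25 kDa (SEQ ID NO:18)": "QLVKSELEEK",
    "41 kDa and 30 kDa (SEQ ID NO:19)": "VSKIRTFGWV",
    "11 kDa (SEQ ID NO:20)": "FTRVPKRVY",
    "SEQ ID NO:32": "KQLVKSELEEK",
}


def first_position(protein, peptide):
    """Return the 1-based position of the first match, or None if absent."""
    index = protein.find(peptide)
    return None if index == -1 else index + 1


positions = {name: first_position(FOKI, pep) for name, pep in PEPTIDES.items()}

print(len(FOKI))  # prints: 579
for name, pos in positions.items():
    print(name, pos)
```

In this numbering the 25 kDa fragment begins at residue 384. A numbering that omits the initiator Met would place it at residue 383, which would agree with the division into amino acids 1-382 and 383-578 used in the claims; that reading is an inference from the listing, not a statement made in it.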

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
AAGCAACTAG TCAAAAGTGA ACTGGAGGAG AAG 33

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Leu Val Lys Ser Glu Leu Lys Ser Glu Leu Glu Glu Lys
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 
GGACTAGTCA AATCTGAACT TAAAAGTGAA CTGGAGGAGA AG 42 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Leu Val Lys Ser Glu Leu Glu Glu Lys Lys Ser Glu Leu Glu
1               5                   10

Glu Lys
15



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

GGACTAGTCA AATCTGAACT TGAGGAGAAG AAAAGTGAAC 
TGGAGGAGAA G 
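
The correspondence between the nucleotide entries SEQ ID NO:33, NO:35 and NO:37 and the peptide entries SEQ ID NO:32, NO:34 and NO:36 can be checked by translation. The Python sketch below is illustrative only and is not part of the filed listing. It uses the standard genetic code; the function name is introduced here, and the reading-frame offset of three nucleotides applied to SEQ ID NO:35 and NO:37 is an assumption inferred from the peptide entries, not something the listing states.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate `dna` into one-letter amino acids, starting at index `frame`."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


SEQ33 = "AAGCAACTAGTCAAAAGTGAACTGGAGGAGAAG"
SEQ35 = "GGACTAGTCAAATCTGAACTTAAAAGTGAACTGGAGGAGAAG"
SEQ37 = "GGACTAGTCAAATCTGAACTTGAGGAGAAGAAAAGTGAACTGGAGGAGAAG"

print(translate(SEQ33))           # prints: KQLVKSELEEK      (SEQ ID NO:32)
print(translate(SEQ35, frame=3))  # prints: LVKSELKSELEEK    (SEQ ID NO:34)
print(translate(SEQ37, frame=3))  # prints: LVKSELEEKKSELEEK (SEQ ID NO:36)
```

Relative to the translation of SEQ ID NO:33, the second and third translations carry four and seven additional residues, respectively, in line with the four-codon and seven-codon insertions described for the mutants.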



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Asn Phe Xaa Xaa 
1 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
TTGAAAATTA CTCCTAGGGG CCCCCCT

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
GGATGNNNNNNNNNNNNNNNNNN

* * * * *

All publications mentioned hereinabove are 
hereby incorporated by reference. 

While the foregoing invention has been 
described in some detail for purposes of clarity and 
understanding, it will be appreciated by one skilled 
in the art that various changes in form and detail 
can be made without departing from the true scope of 
the invention.

WHAT IS CLAIMED IS:

1. An isolated DNA segment encoding the 
recognition domain of a Type IIS endonuclease which 
contains the sequence-specific recognition activity 
of said Type IIS endonuclease. 

2. The DNA segment of claim 1 wherein
said Type IIS endonuclease is FokI restriction
endonuclease.

3. The DNA segment of claim 2 which
encodes amino acids 1-382 of the FokI restriction
endonuclease.

4. An isolated DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease. 

5. The DNA segment of claim 4 wherein
said Type IIS endonuclease is FokI restriction
endonuclease.

6. The DNA segment of claim 5 which
encodes amino acids 383-578 of the FokI restriction
endonuclease.

7. An isolated protein consisting
essentially of the N-terminus of the FokI
restriction endonuclease which protein has the
sequence-specific recognition activity of said
endonuclease.

8. An isolated protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease which protein has the
cleavage activity of said endonuclease.

9. A DNA construct comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 
and 

(iii) a vector 

wherein said first DNA segment and said 
second DNA segment are operably linked to said 
vector so that a single protein is produced. 

10. The DNA construct according to claim 9
wherein said Type IIS endonuclease is FokI
restriction endonuclease.

11. The DNA construct according to claim 10
wherein said recognition domain is selected from
the group consisting of: zinc finger motifs, homeo 
domain motifs, DNA binding domains of repressors, 
DNA binding domains of oncogenes and naturally 
occurring sequence-specific DNA binding proteins 
that recognize >6 base pairs. 



12. A procaryotic cell comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 
and 

(iii) a vector 

wherein said first DNA segment and said second DNA 
segment are operably linked to said vector so that a 
single protein is produced. 

13. A hybrid restriction enzyme 
comprising the catalytic domain of a Type IIS 
endonuclease which contains the cleavage activity of 
said Type IIS endonuclease covalently linked to a 
recognition domain of a protein other than said Type 
IIS endonuclease. 

14. The hybrid restriction enzyme of 
claim 13 wherein said recognition domain which 
comprises part of said hybrid restriction enzyme is 
selected from the group consisting of: zinc finger 
motifs, homeo domain motifs, DNA binding domains of 
repressors, DNA binding domains of oncogenes and 
naturally occurring sequence-specific DNA binding 
proteins that recognize >6 base pairs. 

15. A DNA construct comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS
endonuclease;

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 

(iii) a third DNA segment comprising one 
or more codons, wherein said third DNA segment is 
inserted between said first DNA segment and said 
second DNA segment; and 

(iv) a vector 

wherein said first DNA segment, said 
second DNA segment and said third DNA segment are 
operably linked to said vector so that a single 
protein is produced. 

16. The DNA construct according to claim 15
wherein said Type IIS endonuclease is FokI
restriction endonuclease.

17. The DNA construct according to claim 16
wherein said third DNA segment consists
essentially of four codons.

18. The DNA construct according to claim 17
wherein said four codons of said third DNA
segment are inserted at nucleotide 1152 of the gene
encoding said endonuclease.

19. The DNA construct according to claim
16 wherein said third DNA segment consists
essentially of 7 codons.

20. The DNA construct according to claim
19 wherein said 7 codons of said third DNA segment
are inserted at nucleotide 1152 of the gene encoding
said endonuclease.

21. The DNA construct according to claim 
16 wherein said recognition domain is selected from 
the group consisting of: zinc finger motifs, homeo 
domain motifs, DNA binding domains of repressors, 
DNA binding domains of oncogenes and naturally 
occurring sequence-specific DNA binding proteins 
that recognize >6 base pairs. 

22. A procaryotic cell comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 

(iii) a third DNA segment comprising one 
or more codons, wherein said third DNA segment is 
inserted between said first DNA segment and said 
second DNA segment; and 

(iv) a vector 

wherein said first DNA segment, said 
second DNA segment, and said third DNA segment are 
operably linked to said vector so that a single 
protein is produced.

23. The procaryotic cell of claim 22
wherein said third DNA segment consists essentially 
of four codons. 

24. The procaryotic cell of claim 22 
wherein said third DNA segment consists essentially 
of seven codons. 

25. An isolated protein produced by the 
procaryotic cell of claim 22. 

26. An isolated DNA segment encoding the
N-terminus of a Type IIS endonuclease which contains
the sequence-specific recognition activity of said
Type IIS endonuclease, said Type IIS endonuclease
being FokI restriction endonuclease and having a
molecular weight of about 41 kilodaltons as measured
by SDS-polyacrylamide gel electrophoresis.

27. An isolated DNA segment encoding the
C-terminus of a Type IIS endonuclease which contains
the cleavage activity of said Type IIS endonuclease,
said Type IIS endonuclease being FokI restriction
endonuclease and having a molecular weight of about
25 kilodaltons as determined by SDS-polyacrylamide
gel electrophoresis.

28. An isolated protein consisting
essentially of the N-terminus of the FokI restriction
endonuclease which protein has the sequence-specific
recognition activity of said endonuclease and which
protein is amino acids 1-382 of said FokI restriction
endonuclease.



29. An isolated protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease which protein has the
nuclease activity of said endonuclease and which
protein is amino acids 383-578 of said FokI
restriction endonuclease.



AMENDED CLAIMS 
[received by the International Bureau 
11 July 1994 (11.07.94); original claims 1-8 replaced by new 
claims 1 and 2; original claims 9-25 and 28,29 renumbered 
as new claims 3-21 and 22,23; original claims 
26 and 27 cancelled (6 pages)] 

1. An isolated DNA segment encoding the
N-terminus of a Type IIS endonuclease which contains
the sequence-specific recognition activity of said
Type IIS endonuclease, said Type IIS endonuclease
being FokI restriction endonuclease and said
N-terminus having a molecular weight of about 41
kilodaltons as determined by SDS-polyacrylamide gel
electrophoresis wherein said isolated DNA segment
encodes amino acids 1-382 of said FokI restriction
endonuclease.

2. An isolated DNA segment encoding the
C-terminus of a Type IIS endonuclease which contains
the cleavage activity of said Type IIS endonuclease,
said Type IIS endonuclease being FokI and said
C-terminus having a molecular weight of about 25
kilodaltons, as determined by SDS-polyacrylamide gel
electrophoresis, wherein said isolated DNA segment
encodes amino acids 383-578 of said FokI restriction
endonuclease.

3. A DNA construct comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease;

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 
and 

(iii) a vector 



wherein said first DNA segment and said 
second DNA segment are operably linked to said 
vector so that a single protein is produced. 

4. The DNA construct according to claim 3
wherein said Type IIS endonuclease is FokI
restriction endonuclease.

5. The DNA construct according to claim 4 
wherein said recognition domain is selected from the 
group consisting of: zinc finger motifs, homeo 
domain motifs, DNA binding domains of repressors, 
DNA binding domains of oncogenes and naturally 
occurring sequence-specific DNA binding proteins 
that recognize >6 base pairs. 

6. A procaryotic cell comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 
and 

(iii) a vector 

wherein said first DNA segment and said second DNA 
segment are operably linked to said vector so that a 
single protein is produced. 

7. A hybrid restriction enzyme comprising 
the catalytic domain of a Type IIS endonuclease 
which contains the cleavage activity of said Type
IIS endonuclease covalently linked to a recognition
domain of a protein other than said Type IIS
endonuclease.

8. The hybrid restriction enzyme of claim 
7 wherein said recognition domain which comprises 
part of said hybrid restriction enzyme is selected 
from the group consisting of: zinc finger motifs, 
homeo domain motifs, DNA binding domains of 
repressors, DNA binding domains of oncogenes and 
naturally occurring sequence-specific DNA binding 
proteins that recognize >6 base pairs. 

9. A DNA construct comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease;

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 

(iii) a third DNA segment comprising one 
or more codons, wherein said third DNA segment is 
inserted between said first DNA segment and said 
second DNA segment; and 

(iv) a vector 

wherein said first DNA segment, said 
second DNA segment and said third DNA segment are 
operably linked to said vector so that a single 
protein is produced. 

10. The DNA construct according to claim 9
wherein said Type IIS endonuclease is FokI
restriction endonuclease.

11. The DNA construct according to claim 10
wherein said third DNA segment consists
essentially of four codons.

12. The DNA construct according to claim 11
wherein said four codons of said third DNA
segment are inserted at nucleotide 1152 of the gene
encoding said endonuclease.

13. The DNA construct according to claim
10 wherein said third DNA segment consists
essentially of 7 codons.

14. The DNA construct according to claim 
13 wherein said 7 codons of said third DNA segment 
are inserted at nucleotide 1152 of the gene encoding 
said endonuclease. 

15. The DNA construct according to claim 
10 wherein said recognition domain is selected from 
the group consisting of: zinc finger motifs, homeo 
domain motifs, DNA binding domains of repressors, 
DNA binding domains of oncogenes and naturally 
occurring sequence-specific DNA binding proteins 
that recognize >6 base pairs. 

16. A procaryotic cell comprising:

(i) a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS
endonuclease;

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 

(iii) a third DNA segment comprising one
or more codons, wherein said third DNA segment is 
inserted between said first DNA segment and said 
second DNA segment; and 

(iv) a vector 

wherein said first DNA segment, said 
second DNA segment, and said third DNA segment are 
operably linked to said vector so that a single 
protein is produced. 

17. The procaryotic cell of claim 16 
wherein said third DNA segment consists essentially 
of four codons. 

18. The procaryotic cell of claim 16 
wherein said third DNA segment consists essentially 
of seven codons. 

19. An isolated hybrid Type IIS 
endonuclease produced by the procaryotic cell of 
claim 16. 

20. An isolated DNA segment encoding the
N-terminus of a Type IIS endonuclease which contains
the sequence-specific recognition activity of said
Type IIS endonuclease, said Type IIS endonuclease
being FokI restriction endonuclease and having a
molecular weight of about 41 kilodaltons as measured
by SDS-polyacrylamide gel electrophoresis.

21. An isolated DNA segment encoding the
C-terminus of a Type IIS endonuclease which contains
the cleavage activity of said Type IIS endonuclease,
said Type IIS endonuclease being FokI restriction
endonuclease and having a molecular weight of about
25 kilodaltons as determined by SDS-polyacrylamide
gel electrophoresis.

22. An isolated protein consisting
essentially of the N-terminus of the FokI restriction
endonuclease which protein has the sequence-specific
recognition activity of said endonuclease and which
protein is amino acids 1-382 of said FokI restriction
endonuclease.

23. An isolated protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease which protein has the
nuclease activity of said endonuclease and which
protein is amino acids 383-578 of said FokI
restriction endonuclease.
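
By way of illustration only, the arithmetic behind the codon insertions of claims 12 and 14 and the domain boundaries of claims 22 and 23 can be sketched in Python. The sketch assumes that nucleotide numbering begins at the first base of the start codon, which the claims do not state. The function name `insert_codons`, the toy gene and the GGT linker are hypothetical placeholders and are not sequences taken from this specification.

```python
def insert_codons(gene: str, linker: str, position: int) -> str:
    """Insert `linker` after the first `position` nucleotides of `gene`.

    Both checks guard the reading frame: the linker must be a whole
    number of codons and the insertion point must sit between codons.
    """
    if len(linker) % 3 != 0:
        raise ValueError("linker must be a whole number of codons")
    if position % 3 != 0:
        raise ValueError("position must fall on a codon boundary")
    return gene[:position] + linker + gene[position:]


# ASSUMPTION: numbering starts at the A of the ATG start codon.
INSERTION_SITE = 1152
codon_before_site = INSERTION_SITE // 3  # last complete codon before the site

# Hypothetical toy gene and linker, used only to show frame preservation.
toy_gene = "ATG" + "GCT" * 5
four_codon_mutant = insert_codons(toy_gene, "GGT" * 4, 9)
seven_codon_mutant = insert_codons(toy_gene, "GGT" * 7, 9)

# Domain lengths implied by the residue ranges 1-382 and 383-578.
n_terminal_length = 382 - 1 + 1
c_terminal_length = 578 - 383 + 1
```

Under the stated numbering assumption, nucleotide 1152 lies on a codon boundary, so a 4-codon or 7-codon insert lengthens the protein without shifting the downstream reading frame.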

Figure 1

FokIM
5' primer

NcoI 7-bp spacer

5' TA CCA TGG AGGT TTAAAAT ATG AGA TTT ATT GGC AGC
RBS Met Arg Phe Ile Gly Ser

3' primer

18-bp complement NcoI

3' ACT ACG ACA CAG TAA ATT AAG GGTACC ATA 5'

FokIR
5' primer

BamHI RBS 7-bp spacer

5' TA GGATCC GGAGGT TTAAAAT ATG GTT TCT AAA ATA AGA ACT

Met Val Ser Lys Ile Arg Thr

3' primer

Complementary Strand BamHI

3' TTA TTG CCG CTC TAT TTG AAA ATT ACT CC TAGG AT 5'
Asn Asn Gly Glu Ile Asn Phe
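
The amino acid labels under the primers of Figure 1 can be checked against the primer sequences with a short Python sketch. This is offered as an illustrative check only: the helper names `complement` and `translate` are hypothetical, the codon table holds just the codons that occur in these primers, and the primer strings are transcribed from the figure with spaces removed.

```python
# Partial codon table: only the codons that appear in the Figure 1 primers.
CODONS = {
    "ATG": "Met", "AGA": "Arg", "TTT": "Phe", "ATT": "Ile", "GGC": "Gly",
    "AGC": "Ser", "GTT": "Val", "TCT": "Ser", "AAA": "Lys", "ATA": "Ile",
    "ACT": "Thr", "AAT": "Asn", "AAC": "Asn", "GAG": "Glu",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, with no reversal of the sequence."""
    return seq.translate(COMPLEMENT)


def translate(seq: str) -> list[str]:
    """Translate complete codons, reading from the first base of `seq`."""
    usable = len(seq) - len(seq) % 3
    return [CODONS[seq[i:i + 3]] for i in range(0, usable, 3)]


# Coding portions of the two 5' primers, written 5'->3'.
fokim_5prime = "ATGAGATTTATTGGCAGC"
fokir_5prime = "ATGGTTTCTAAAATAAGAACT"

# Coding portion of the FokIR 3' primer, written 3'->5' as in the figure.
# Its plain complement therefore reads 5'->3' on the coding strand.
fokir_3prime = "TTATTGCCGCTCTATTTGAAA"

fokim_start = translate(fokim_5prime)
fokir_start = translate(fokir_5prime)
fokir_end = translate(complement(fokir_3prime))
```

Because the 3' primer is drawn 3'-to-5', taking the complement without reversing it gives the coding strand directly, which is why the residues under that primer read left to right.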

FIG. 5

FIG. 6A

FIG. 8

FIGURE 9

FIGURE 10

Figure 11

FIG. 12

FIG. 13A

FIG. 13B

FIG. 13C






FIG. 14A

FIG. 15A

(A) wild-type FokI

FIG. 15B

(B) 4-codon insertion mutant

FIG. 15C

(C) 7-codon insertion mutant